This paper describes the Quality Measurement Plan (QMP), a 
recently implemented system for reporting the quality assurance audit 
results to Bell System management. QMP replaces the T-rate system, 
which evolved from the pioneering statistical work of Shewhart and 
Dodge during the 1920's and 1930's at Bell Laboratories. Box and 
whisker plots are used for graphically displaying confidence intervals 
for the quality of the current production. The confidence interval is 
computed from both current and past data and is derived from a new 
Bayesian approach to the empirical Bayes problem for Poisson 
observations. Here we discuss the rationale, mathematical deriva- 
tions, dynamics, operating characteristics, and many comparative 
examples. We show that qmp reduces statistical errors relative to the 
earlier T-rate system. 

I. INTRODUCTION 
1.1 Quality assurance 

The responsibility of the Bell Laboratories Quality Assurance Center 
(QAC) is "to ensure that the communications products designed by Bell 
Laboratories and bought by Bell System operating companies from 
Western Electric Company, Incorporated will meet quality standards 
and will perform as the designers intended" (Ref. 1). This obviates the need 
for each operating company to carry out its own acceptance inspection. 

To meet this responsibility, the QAC works with its Western Electric 
(WE) agents, the Quality Assurance Directorate (QAD) (Ref. 2) and Purchased 
Products Inspection (PPI) organizations. However, as stated in Ref. 1, 
"The primary responsibility for quality lies with the line organizations: 
Bell Laboratories for the quality of design and Western Electric for 
the quality of manufacture, installation, and repair." The quality 
assurance organizations conduct independent activities to assure qual- 
ity to the operating companies. 
1.2 Quality assurance audit 

The quality assurance organizations have two major activities. The 
first is to conduct quality audits where products change hands, either 
within WE or between WE and the operating companies. Examples 
are manufacturing, installation, and repair audits. The second concerns 
a collection of field quality monitoring activities. Examples are the 
Product Performance Surveys. These are designed sample surveys of 
reported field troubles. 

An audit is a highly structured system of inspections done on a 
sampling basis. The ingredients of an audit are: (i) sampling method, 
(ii) scope of inspection, (iii) quality standards, (iv) nonconformance 
procedures, (v) defect assessment practices, (vi) quality rating method, 
and (vii) report formats. 

The sampling method along with the scope of inspection determines 
what tests will be performed on what units of product or attributes of 
product. The statistics and economics of sampling, the engineering 
requirements, and the field effect of defects play the central roles in 
determining the sampling and the scope of inspection. 

The quality standards are numerical values expressed in defects, 
defectives, or demerits per unit. They are set by the QAC in consultation 
with the QAD. The standards are target values, reflecting a tradeoff 
between first cost and maintenance costs. 

The nonconformance procedures are rules for detecting and dispos- 
ing of audited lots that are excessively defective with respect to a 
particular set of engineering requirements. 

The defect assessment practices are a set of transformations that 
map defects found into defects assessed for quality rating purposes. 
For example, a terminal strip may have all ten connections off by one 
position, but the consequences of these ten defects found are much less 
severe than those of ten independent occurrences of this defect. 
Therefore, fewer than ten defects are assessed. 

The quality rating method and report formats determine how the 
results of the audit are presented to Bell System management. For 
example, a product is reported as "Below Normal," when it fails a 
statistical test of the hypothesis that the quality standard is being met. 

1.3 The quality measurement plan (QMP) 

The statistical foundations of the audit ingredients were developed 
by Shewhart, Dodge, and others, starting in the 1920's and continuing 
through to the middle 1950's. This work was documented in the 
literature in Refs. 3 to 6. 

In recent years, research has been carried out to evaluate the 
application of modern statistical theories to the audit ingredients. An 
important idea is summarized in an article by Efron and Morris (Ref. 7), which 
explains a paradox discovered by Stein (Ref. 8). When you have samples from 
similar populations, the individual sample characteristics are not the 
best estimates of the individual population characteristics. Total error 
is reduced by shrinking the individual sample characteristics part way 
towards the grand mean over all samples. Efron and Morris used 
baseball batting averages to illustrate the point. But the problem of 
estimating percent defective in quality assurance is the same problem. 
And you are always concerned with similar populations — for example, 
the population of design-line telephones produced for each of several 
months. 

This idea was originally explored in Ref. 9. The idea has now evolved 
into the Quality Measurement Plan (qmp). qmp is the recently imple- 
mented system for conducting three of the audit ingredients: defect 
assessment, quality rating, and quality reporting. 

As a quick introduction to QMP, consider Fig. 1. This is a comparison 
of the QMP reporting format (Fig. 1a) with the old T-rate reporting 
format (Fig. 1b). Each year is divided into eight periods. On the 
bottom, the T-rate is plotted for each period and it measures the 
difference between the observed and standard defect rates in units of 
sampling standard deviation (given standard quality). The idea is that 
if the T-rate is, e.g., less than negative two, then the hypothesis of 
standard quality is rejected. Section II considers the exact rules for 
exception reporting under the T-rate system. 

Under qmp, a box and whisker chart is plotted each period. The box 
chart is a graphical representation of the posterior distribution of 
current population quality on an index scale. The index value one is 
the standard on the index scale and the value two means twice as 
many defects as expected under the quality standard. The posterior 
probability that the population index is larger than the top whisker is 
0.99. The top of the box, the bottom of the box, and the bottom 
whisker correspond to the probabilities 0.95, 0.05, and 0.01, respec- 
tively. 

The heavy "dot" is a Bayes estimate of the long run process average; 
the "cross" is the observed value in the current sample; and the "dash" 
near the middle of the box is the posterior mean of the current 
population index and is called the Best Measure of current quality. 
The process averages, "dots," are joined to show trends. 

Although the T-rate chart and the qmp chart often convey similar 
messages, there are differences. The qmp chart provides a measure of 
quality; the T-rate chart does not. For example, in 7806 (Period 6 of 
1978) both charts imply that the quality is substandard, but the qmp 
chart also implies that the population index is somewhere between one 
and two. 

Fig. 1 — QMP versus the T-rate. The box and whisker plot in (a) is the QMP replacement 
of the T-rate. One is standard on the index scale; two is a defect rate of twice standard. 
The box and whisker are 90 and 98 percent confidence intervals for production during 
the period; the "crosses" are the indices in the samples; the "dots" are process averages; 
and the "dashes" in the middle of the boxes are Best Measures of current quality derived 
from empirical Bayes theory. (b) is a time series of T-rates. Each point measures the 
difference between observed quality and expected quality on a standard deviation scale. 
Notice that the sixth period of 1977 and the fourth period of 1978 are the same in the 
T-rate chart but quite different in the QMP chart. 

QMP and the T-rate use the past data in very different ways. QMP 
uses the past sample indices, but makes an inference about current 
quality. The T-rate system uses runs criteria based on attributes of 
the T-rate, such as "less than zero," and can make an inference about 
past quality. In Fig. 2, 7707, the T-rate signals an exception, because 
six T-rates in a row are less than zero, indicating that quality has not 
been standard for all six periods. But for qmp, the standard is well 
within the box, indicating normal current quality. The different treat- 
ment of past data is also illustrated in Fig. 1. Comparing 7706 with 
7804 reveals very similar T-rates, but qmp box charts with different 
messages. 

The T-rate system is based on the assumption that the total number 
of defects in a rating period has a normal distribution. QMP is based on 
the Poisson distribution. This difference is important for small audits, 
as shown in Section VII. 

Fig. 2 — A weak T-rate exception. The seventh period of 1977 was reported as a quality 
exception because six T-rates in a row were less than zero. For QMP, it would have been 
reported as normal. This is because QMP provides a statistical inference about current 
production only, even though past data is used. 

QMP was on trial for two years and was applied to 20,000 sets of 
audit data. The relatively simple QMP algorithm published in Ref. 9 
was used originally. This simple algorithm worked for most data sets, 
but not all (e.g., zero defects in every period). The relatively complex 
algorithm discussed in Section IV is the result of a lengthy fine-tuning 
process, designed to make the algorithm work for every case. This is 
why the full power of Bayes theorem with empirically based prior 
distributions had to be used. 

1.4 Relationship to the empirical Bayes approach 

Note that in the qmp box chart, the Best Measure always lies 
between the estimated process average and the current sample index. 
The Best Measure is a shrinkage of the sample index towards the 
estimated process average. In 7706 of Fig. 1a, the shrinkage is away 
from standard; but, in 7804, it is towards the standard. 

The Best Measure is related to the class of estimators described by 
Efron and Morris (Ref. 10). In the cited reference, they provide a foundation 
for Stein's paradox with an empirical Bayes approach. In Ref. 7, they 
used baseball data to illustrate Stein's paradox. There is a clear analogy 
between percent defective in a quality assurance application and a 
baseball batting average. The data in Ref. 7 was for many players at 
a given point in time. The qmp algorithm works with the data for one 
product over time. So a better baseball analogy would be one player 
over time. 

Table I contains batting average data for Thurman Munson from 
1970 through 1978. This data was collected and analyzed by S. G. 
Crawford and is displayed graphically in Fig. 3. The "crosses" are 
Munson's batting averages reported on the last Sunday of April for 
each year. The "boxes" are Munson's batting averages at the end of 
the season. The dashed line is the average of the "crosses." 

The early season averages are analogous to the audit data. The 
averages are the results from small samples of the populations. The 
populations are the finite populations of "at bats" for each season. In 
the audit, we are interested in making a statistical inference each 
period about the current population. So our problem in Fig. 3 is to 
make a statistical inference each year about the season batting average 
using only the early season averages observed to date. 

Table I — Batting average data for Thurman Munson 
[Table body not recoverable in this copy; surviving entries: 0.302, 0.165, 1971, 37, 6, 0.280.] 
* AP statistics. 

220 THE BELL SYSTEM TECHNICAL JOURNAL, FEBRUARY 1981 

[Fig. 3 plot. Vertical axis: batting average, 0.150 to 0.400. Horizontal axis: year, 1970 through 1978. Legend: ■ season average; × early season average. Reference line: aggregate early season average over time.] 

Fig. 3 — Batting averages for Thurman Munson. For each year, the movement from 
the early season average (the sample) to the season average (the population) is always 
in the direction of the time average of the samples. This suggests strongly that by 
shrinking the samples towards their time average, one can obtain improved estimates of 
the populations. 

As an estimate, one would be tempted to use the maximum like- 
lihood estimate, the early season average. But, in every year, the 
movement of the batting average from the early season to the season 
end is in the direction of the aggregate early season average over time. 
So, paradoxically, the early season averages from other years seem to 
be relevant to the current season average. It is clear from the data, 
that a better estimate of the season average is some kind of shrinkage 
of the early season average towards the aggregate early season average 
over time. And the amount of shrinkage can depend only on the 
available data — the early season averages. 

What we really have here is a multivariate problem. We observe a 
nine-dimensional vector of observations whose mean is a vector of 
population characteristics, one of which we are particularly interested 
in. Stein (Ref. 8) showed (for the normal distribution) that the maximum 
likelihood estimate of the vector is inadmissible. Why this is true 
manifests itself in baseball lore. A player who starts the season 
relatively hot usually cools off; and a player who starts in a relative slump 
usually improves. This is due to the nature of sampling error. The hot 
player is usually partially lucky and the slumping player is usually 
partially unlucky. 

QMP is based on the concepts illustrated by the Munson data. We 
saw in Fig. 1a that the Best Measure of the population index is between 
the current observed index and the estimated long-run process average. 

The approach used for QMP is actually Bayesian empirical Bayes. 
The shrinkage factor used is a Bayes estimate of an optimal shrinkage 
factor. So the Best Measure has the form 

\[
W\,[\text{estimated process average}] + (1 - W)\,[\text{current sample index}],
\]

where W is a Bayes estimate of 

\[
\frac{[\text{sampling variance}]}{[\text{sampling variance}] + [\text{process variance}]} .
\]


The bigger the sampling variance is, relative to the process variance, 
the more weight is put on the estimated process average. 
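
The weighting just described can be sketched in a few lines of Python. This is only an illustration of the shrinkage form: the function names are hypothetical, and the two variances are treated as known inputs, whereas QMP itself uses a Bayes estimate of the weight (derived later in the paper).

```python
def shrinkage_weight(sampling_variance: float, process_variance: float) -> float:
    """Weight on the process average:
    sampling variance / (sampling variance + process variance).

    Assumption for illustration: both variances are known. QMP replaces
    this ratio by a Bayes estimate W.
    """
    if sampling_variance < 0 or process_variance < 0:
        raise ValueError("variances must be non-negative")
    total = sampling_variance + process_variance
    if total == 0:
        raise ValueError("at least one variance must be positive")
    return sampling_variance / total


def best_measure(process_average: float, sample_index: float, weight: float) -> float:
    """Shrink the current sample index toward the process average."""
    return weight * process_average + (1.0 - weight) * sample_index


# Hypothetical numbers: noisy sample (index 2.0), stable process (average 1.0).
w = shrinkage_weight(sampling_variance=0.3, process_variance=0.1)
estimate = best_measure(process_average=1.0, sample_index=2.0, weight=w)
```

With these made-up inputs the weight is 0.75, so the estimate lies three quarters of the way from the sample index toward the process average.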

There are two advantages to the Bayesian empirical Bayes approach 
over the approach in Ref. 10. One is that the weight, W, is always 
strictly between zero and one. This is because W is a Bayes estimate 
of an unknown optimal weight, w, which has a nondegenerate posterior 
distribution on the interval [0, 1]. The approach taken in Ref. 10 is to 
use maximum likelihood estimates of w, which can be one; i.e., total 
shrinkage to the process average. 

The second advantage is that an interval estimate of the current 
population index can be constructed from its posterior distribution. 
Most of the literature (e.g., Ref. 10) treats the estimation problem 
thoroughly, but it provides little guidance for the interval estimation 
problem. 

The qmp algorithm is applied to the Munson data and the qmp 
estimates of the season averages are given in Table I. The sums of the 
absolute errors for the maximum likelihood estimates (April averages) 
and the qmp estimates are 0.603 and 0.331, respectively — a forty-five 
percent improvement. Notice that the qmp estimates for 1970 and 1971 
are close to the April averages. This is because there was no history on 
Munson. The reduction in total absolute error for the years 1973 
through 1978 was sixty-five percent, because of the benefit of history. 

1.5 Objectives 

This paper is intended to document qmp. It contains the rationale 
for changing the rating system, a synopsis of qmp features, mathemat- 
ical derivations of the rating formulas, the dynamics of qmp, the 
operating characteristics of QMP, many examples, and the qmp report- 
ing format. 

Readers who are interested only in the mathematics of QMP and how 
it relates to empirical Bayes may skip Section II. Readers who are 
not interested in the mathematical derivation of QMP may skip Section 
IV. 

II. T-RATE SYSTEM 

To understand the rationale behind QMP, one must first understand 
the T-rate system. From this we shall see where things have changed 
and where things have remained the same. 

2.1 Finding defects 

The sampling methods along with the scope of inspection provide 
for a sample of units of count for each set of inspections. A unit of 
count is either a unit of product or a unit of a product's attribute such 
as solderless wrapped connections. 

The result of conducting a set of inspections is a list of defects found 
and their descriptions. Frequently, underlying a defect is a variable 
measurement* that falls outside a range. QMP does not affect the 
process of finding defects. 

2.2 Assessing defects 

The defects found sometimes occur in clusters for which the effect 
of the cluster is nonadditive; i.e., the effect is less than the sum of the 
effects of the individual defects occurring by themselves. In this case, 
the number of defects assessed for rating purposes is less than the 
number found. The defect assessment practices for the T-rate system 
evolved over a 50-year period, so these practices were based on a 
variety of criteria and engineering judgements. The defect assessment 
practices under qmp amount to a redesign of the practices using a 
single principle, which is described in Section 3.1. 

2.3 Defect weighting and demerits 

The defects assessed are transformed into demerits or defectives or 
may remain as simple unweighted defects. In an audit based on 
demerits, each defect assessed is assigned a number of demerits: 100, 
50, 10, or 1 for A, B, C, or D weight defects, respectively. Guidelines for 
assigning demerit weights are contained in numerous general and 
special purpose demerit lists. The principles underlying these demerit 
lists are described by Dodge in Refs. 5 and 6. In an audit based on 
defectives, all defects found in a unit of product are analyzed to 
determine if the unit is considered defective. The assessment is either 
one or zero defectives. These transformations to demerits, defectives, 
or defects are not affected by QMP. 

* For rating transmission characteristics of exchange area cable, some variables 
measurements are used directly without conversion to defects. We do not treat this case 
here. 

2.4 Quality standards 

For any set of inspections, the quality engineers in the QAC have 
established quality standards. To do this, they considered audit scope, 
shop capability, field performance, economics, complexity, etc. The 
philosophy of standards is described in Ref. 3. For audits based on 
defects or defectives, the standards are expressed in defects or defec- 
tives per unit. For audits based on demerits, the standards are derived 
from fundamental defect per unit of count standards for A, B, C, D- 
type defects. In addition, we use Poisson as the standard distribution 
of the number of type A defects (for example). 

To make this clear, let's consider a simple example. Suppose in a 
sample of size n, there are $X_A, X_B, X_C, X_D$ type A, B, C, D defects. The 
definition of standard quality is that $X_A, X_B, \ldots$ are independent and 
have Poisson distributions with means $n\lambda_A, n\lambda_B, \ldots$. The number of 
demerits in the sample is 

\[
D = 100X_A + 50X_B + 10X_C + X_D .
\]

The mean and variance of D, given standard quality, are 

\[
\begin{aligned}
E(D \mid S) &= 100(n\lambda_A) + 50(n\lambda_B) + \cdots \\
            &= n[100\lambda_A + 50\lambda_B + 10\lambda_C + \lambda_D] \\
            &= nU_s
\end{aligned}
\]

and 

\[
\begin{aligned}
V(D \mid S) &= (100)^2(n\lambda_A) + (50)^2(n\lambda_B) + \cdots \\
            &= n[(100)^2\lambda_A + (50)^2\lambda_B + (10)^2\lambda_C + \lambda_D] \\
            &= nC_s .
\end{aligned}
\]

The notation "D | S" reads "D conditional on S." 

Note that $U_s$ is the demerit per unit standard and $C_s$ is a variance 
per unit standard. These are the numbers that would be published in 
the official list of standards called the Master Reference list. The 
quality standards are not affected by qmp. 
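
As a rough numerical check of the demerit formulas above, the following Python sketch computes the per-unit standards and the standard mean and variance of D. The per-unit defect rates are hypothetical values chosen for illustration; they are not taken from the Master Reference list, and the function names are assumptions.

```python
# Demerit weights for A, B, C, D defects, as given in Section 2.3.
WEIGHTS = {"A": 100, "B": 50, "C": 10, "D": 1}


def per_unit_standards(rates: dict) -> tuple:
    """Return (U_s, C_s): demerits-per-unit standard and variance-per-unit standard.

    `rates` maps defect class to its standard defects-per-unit rate (lambda).
    """
    u_s = sum(WEIGHTS[k] * rates[k] for k in WEIGHTS)
    c_s = sum(WEIGHTS[k] ** 2 * rates[k] for k in WEIGHTS)
    return u_s, c_s


def standard_moments(n: int, rates: dict) -> tuple:
    """Return (E(D|S), V(D|S)) = (n * U_s, n * C_s) for a sample of n units."""
    u_s, c_s = per_unit_standards(rates)
    return n * u_s, n * c_s


# Hypothetical standard rates (defects per unit), for illustration only.
example_rates = {"A": 0.001, "B": 0.002, "C": 0.01, "D": 0.05}
mean_d, var_d = standard_moments(100, example_rates)
```

For these made-up rates, U_s is 0.35 and C_s is 16.05, so a sample of 100 units has a standard mean of 35 demerits and a standard variance of 1605. The variance is dominated by the heavily weighted A and B defects even though they are the rarest.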

2.5 Rating classes and periods 

For the purpose of reporting quality results to management, the 
products are grouped into rating classes. An example is: ESS No. 1 
wired equipment, functional test, at Dallas.* The results of all the 
inspections associated with this rating class are aggregated over a time 
period called a rating period. A rating period is about six weeks long 
and there are eight rating periods per year. QMP does not affect the 
rating classes or periods. 

2.6 The T-rate 

The advantage of having quality standards is that observed quality 
results can be statistically compared to the standards. In the T-rate 
system, this is done with a statistic called the T-rate. 

For a given rating class, let Q denote the total number of defects, 
defectives, or demerits that are observed in all the inspections 
conducted on all the subproducts during a rating period. Because there 
are quality standards for each set of inspections on each product 
subclass, it is possible to compute the standard mean and variance of 
Q, denoted by $E(Q \mid S)$ and $V(Q \mid S)$. The T-rate is 

\[
\text{T-rate} = \frac{E(Q \mid S) - Q}{\sqrt{V(Q \mid S)}} .
\]


It measures the difference between the observed result and its standard 
in units of statistical standard deviation. 
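
A minimal Python sketch of this statistic follows. The function name and the numbers in the example are hypothetical; only the formula comes from the text.

```python
import math


def t_rate(observed: float, standard_mean: float, standard_variance: float) -> float:
    """T-rate = (E(Q|S) - Q) / sqrt(V(Q|S)).

    Negative values mean more defects, defectives, or demerits were
    observed than the standard expects.
    """
    if standard_variance <= 0:
        raise ValueError("standard variance must be positive")
    return (standard_mean - observed) / math.sqrt(standard_variance)


# Hypothetical rating period: standard mean 100 demerits, standard variance 400,
# and 150 demerits observed.
example_t = t_rate(observed=150, standard_mean=100, standard_variance=400)
```

Here the observed total is 50 demerits above a standard mean whose standard deviation is 20, so the T-rate is -2.5.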

For each rating period, the T-rate is plotted in the control chart 
format shown in Fig. 1b. The control limits of ±2 are reasonable under 
the assumption that Q has an approximate normal distribution. Then 
the distribution of the T-rate under standard quality is the "standard normal," and excursions 
outside the control limits are rare under standard quality. For large 
audit sample sizes, this approximation follows from the central limit 
theorem. As we shall see, the approximation is poor for small sample 
sizes. 

2.7 Reports, Below Normals, and ALERTs 

The fundamental reports to WE management are books of T-rate 
control charts for all rating classes. However, every rating period, a 
summary booklet is prepared. The summary consists of various aggre- 
gate quality performance indices and an exception report which lists 
rating classes that are having quality problems. 

* Technically, this is called a scoring class in quality assurance documentation. Here, 
rating class means scoring class. 

There are two kinds of exceptions: Below Normal (BN) and ALERT. 
These are based on statistical tests of the hypothesis that quality is at 
standard. The rules for BN and ALERT are based on six consecutive 
T-rates, $t_1, \ldots, t_6$, where $t_6$ is the current T-rate. The rules use the 
following runs criteria: 

SCAN (S): $t_1 < 0, \ldots, t_6 < 0$, 

341 (T): $t_6 < -1$ and at least two of the set $\{t_3, t_4, t_5\}$ are less than 
$-1$. 

Finally, the rules for BN and ALERT are: 
Below Normal (BN): One of the following two conditions is satisfied: 

(1) $t_6 < -3$ 

(2) $-3 \le t_6 < -2$ and at least one of the following three conditions 
holds: 

(i) SCAN 

(ii) 341 

(iii) At least one of the set $\{t_2, t_3, t_4, t_5\}$ is less than $-2$. 

ALERT: SCAN or 341 but not BN. 

In Fig. 1b, examples of BN are 7806 and 7803. Examples of ALERT are 
7808 and 7804. 
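
The exception rules above can be expressed as a short Python sketch. The function names are hypothetical, and one boundary detail is an assumption: the lower limit in condition (2) is treated as inclusive, so that a current T-rate of exactly -3 falls under condition (2).

```python
from typing import Sequence


def _check(t: Sequence[float]) -> None:
    if len(t) != 6:
        raise ValueError("expected six consecutive T-rates t1..t6")


def scan(t: Sequence[float]) -> bool:
    """SCAN: all six T-rates are below zero."""
    _check(t)
    return all(x < 0 for x in t)


def rule_341(t: Sequence[float]) -> bool:
    """341: t6 < -1 and at least two of {t3, t4, t5} are below -1."""
    _check(t)
    return t[5] < -1 and sum(x < -1 for x in t[2:5]) >= 2


def below_normal(t: Sequence[float]) -> bool:
    """BN: t6 < -3, or -3 <= t6 < -2 together with SCAN, 341,
    or at least one of {t2, t3, t4, t5} below -2."""
    _check(t)
    t6 = t[5]
    if t6 < -3:
        return True
    if -3 <= t6 < -2:  # inclusive lower bound is an assumption
        return scan(t) or rule_341(t) or any(x < -2 for x in t[1:5])
    return False


def classify(t: Sequence[float]) -> str:
    """Return 'BN', 'ALERT', or 'normal' for T-rates t1..t6 (t6 is current)."""
    if below_normal(t):
        return "BN"
    if scan(t) or rule_341(t):
        return "ALERT"
    return "normal"
```

For example, six T-rates of -0.1 satisfy SCAN and so produce an ALERT even though no single period is far from standard, which is the situation discussed for Fig. 2.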

Both the fundamental report formats and the rules for BN and ALERT 
are different under QMP. 

2.8 Pros and cons of the T-rate 

The advantage of the T-rate is its simplicity. It can be calculated 
manually. Exceptions can be identified by inspection. The fact that 
the T-rate has been used for so long is a testimonial to its advantages. 

However, the T-rate does have problems.* The T-rate does not 
measure quality. A T-rate of -6 does not mean that quality is twice as 
bad as when the T-rate is -3. The T-rate is only a measure of statistical 
evidence with respect to the hypothesis of standard quality. This 
subtle statistical point is often misunderstood by report readers. Years 
of explanations have not cleared up the confusion. 

Another problem is that the ALERT (SCAN and 341) rules are tests of 
hypotheses about quality trends, not current quality. Consider Fig. 2. 
You can assert that quality was probably substandard sometime 
between 7702 and 7707. You cannot, however, assert that quality is 
substandard in 7707. The qmp result for 7707 is normal. 

In addition, the rules for ALERT and to some extent BN depend on 
attributes of past T-rates rather than their exact values. For example, 
five consecutive past T-rates at -1.0 are treated exactly like five 
consecutive T-rates at -0.1. This was done for statistical robustness. 
But statistical information is lost. There are no "outliers" in the audit 
data. Defects assessed were in the product. Many defects assessed 
mean substandard quality at the time of assessment. It is possible that 
very unusual circumstances caused the defects. But it is intended that 
the audit flag such unusual circumstances. 

* Although the foundation of the T-rate system was laid by Shewhart (Ref. 4) and Dodge (Ref. 5), 
the details are the results of contributions by many people over 50 years. 

The significance level of the T-rate hypothesis test depends on 
sample size and can be very large. Suppose that we have a simple test 
defect audit with a sample size of 32 units and a standard of 0.005 
defects per unit. The expected number of defects is (32) (0.005) = 0.16. 
For one defect observed, the T-rate is 

\[
\frac{0.16 - 1}{\sqrt{0.16}} = -2.1 .
\]

So, every time there is a defect, the T-rate exceeds the control limit. 

Now, assuming standard quality, the number of defects has a Poisson 
distribution with mean = 0.16. The Poisson probability of one or more 
defects is 0.15. So even when the standard is being met, there is a 15 
percent chance of the T-rate dropping below -2.0. In statistical terms, 
we have a biased test (i.e., there is no reasonable upper bound on the 
significance level). 
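
The arithmetic in this small-audit example can be reproduced with a few lines of Python. The variable names are illustrative; the sample size, the standard, and the Poisson assumption come from the text.

```python
import math

sample_size = 32          # units audited
standard_rate = 0.005     # standard defects per unit
expectancy = sample_size * standard_rate  # standard mean (and Poisson variance)

# T-rate when exactly one defect is observed.
t_one_defect = (expectancy - 1) / math.sqrt(expectancy)

# Under standard quality the defect count is Poisson with mean `expectancy`,
# so P(one or more defects) = 1 - P(zero defects) = 1 - exp(-expectancy).
p_at_least_one = 1.0 - math.exp(-expectancy)

print(f"expectancy = {expectancy:.2f}")
print(f"T-rate for one defect = {t_one_defect:.2f}")
print(f"P(at least one defect | standard) = {p_at_least_one:.3f}")
```

The probability evaluates to about 0.148, which the text rounds to 15 percent. Under the normal approximation, a T-rate below -2 would be expected only about 2 percent of the time, which is why the approximation is judged poor for audits this small.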

Clearly, it is not reasonable to take action every time the audit finds 
a defect. So special rules called modification treatments have evolved 
to handle cases like the one just described. Some of these modification 
treatments are statistically sound, others are not. This detracts from 
the desired objectivity of our quality rating. 

In a sense, qmp is orthogonal to the T-rate. On the one hand, qmp 
cannot easily be computed manually. On the other hand, qmp does not 
have any of the disadvantages described above for the T-rate. The 
basic message of the QMP box chart (Fig. 1a) is unambiguous and 
exceptions can be identified by inspection. 

III. OVERVIEW OF QMP 

As described in the introduction, qmp is the new way of conducting 
three of the audit ingredients: defect assessment, quality rating, and 
quality reporting. This section contains an overview of qmp. Mathe- 
matical derivations and detailed analyses of qmp are left for later 
sections. 

3.1 Defect assessment practices 

Defect assessment practices have two parts. Part one is a description 
of those situations where fewer defects are assessed than are found. 
Part two is a formula for the number of defects assessed. 

In qmp, the principle for part one is: Normally all defects found in 
the quality assurance audits are assessed. Occasionally a cluster of two 
or more defects is found for which the seriousness of the cluster is less 
than the seriousness implied by individually assessing every defect in 
the cluster. Such a cluster shall be called reducible. Seriousness is 
measured from the customer's point of view. The audit attempts to 
measure seriousness as if the auditor is the customer. So if defects are 
found and corrected as a result of the audit, no adjustment in assess- 
ment is necessary. More specifically, a reducible cluster is a collection 
(on one audited unit) of 

[1] dependent identical defects that the customer will 
[2] almost surely discover in its entirety when a small part of the 

cluster is discovered and 
[3] will correct or otherwise account for en masse, so that 
[4] total seriousness is better represented by assessing $d_a$ defects 
(computed by the assessment formula), rather than the number 
found. 

In [1], we use the word dependent in a statistical sense. Defects are 
dependent if they occur in a short interval of time and are systemati- 
cally introduced by a common feature of the production process. 

Ideally, the assessment associated with a reducible cluster of defects 
should depend on the situation. Over time, lists of reducible clusters 
and their assessments could be catalogued and added to the demerit 
lists. But, for now, there is no list of reducible clusters, so an assessment 
formula is needed. 

For QMP, the assessment formula has the general form 

\[
d_a = \mathrm{AN} + 1,
\]

where AN stands for "Allowance Number." In turn, AN has the general 
form $\mathrm{AN} = e + 3\sqrt{e}$, rounded down to an integer or to the closest 
integer, where e can be interpreted as an expected number of defects. 
The computation of e and the rounding depend on the audit. For some 
audits, tradition has prevailed, and for other audits, methods of 
computing e were developed for QMP. 

As an example, consider a single relay for which three contacts are 
defective (class B defect). The traditional method of computing e for 
an apparatus audit is 

\[
e = (12)(0.005) = 0.06,
\]

where 12 is the number of contacts in the relay and 0.005 is a traditional 
generic standard per unit of count for class-B defects. The traditional 
rounding is down, so $\mathrm{AN} = 0.06 + 3\sqrt{0.06} = 0.79$, rounded down to 
zero. Hence, one class-B defect is assessed. 
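
The relay example can be checked with the following Python sketch of the assessment formula. The function names and the `rounding` parameter are hypothetical. The text does not say whether the assessed count is capped at the number of defects found, so no cap is modeled here.

```python
import math


def allowance_number(e: float, rounding: str = "down") -> int:
    """AN = e + 3*sqrt(e), rounded down or to the closest integer.

    `e` is the expected number of defects; which rounding applies
    depends on the audit.
    """
    if e < 0:
        raise ValueError("expected number of defects must be non-negative")
    raw = e + 3.0 * math.sqrt(e)
    if rounding == "down":
        return math.floor(raw)
    if rounding == "nearest":
        return math.floor(raw + 0.5)
    raise ValueError("rounding must be 'down' or 'nearest'")


def defects_assessed(e: float, rounding: str = "down") -> int:
    """d_a = AN + 1 for a reducible cluster."""
    return allowance_number(e, rounding) + 1


# Relay example from the text: 12 contacts, generic class-B standard 0.005.
e_relay = 12 * 0.005
d_a_relay = defects_assessed(e_relay, rounding="down")
```

For the relay, the unrounded allowance is about 0.79, the allowance number is zero, and one defect is assessed, in agreement with the text.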

Another example is a reducible cluster of loose terminations found 
on a bay of equipment in a transmission installation performance 
audit. In this case, e is just the quality standard for the bay in defects 
per unit, and the rounding is to the nearest integer. 
As you have gathered by now, defect assessment is an art, not a 
science. The principles and rules described here have empirical valid- 
ity. In practice, they usually lead to reasonable assessments. 

3.2 Equivalent defects and expectancy 

A complicating factor in the analyses of audit results is that defects, 
defectives, and demerits are different. But, are they really? The answer 
is no; because, for statistical purposes, they can all be transformed into 
equivalent defects that have approximate Poisson distributions. 

Suppose we have a quality measure Q (total defects, defectives, or 
demerits). Let $E_s$ and $V_s$ denote the standard mean (called expectancy) 
and variance of Q. So the T-rate is $T = (E_s - Q)/\sqrt{V_s}$. 

Now define 

\[
X = \text{equivalent defects} = \frac{Q}{V_s/E_s}
\]

and 

\[
e = \text{equivalent expectancy} = \text{standard mean of } X
  = \frac{E_s}{V_s/E_s} = \frac{E_s^2}{V_s} .
\]

If all defects have Poisson distributions and are occurring at $\theta$ times 
the standard rate, then it can be shown that 

\[
E(X \mid \theta) = V(X \mid \theta) = e\theta ;
\]

hence, X has an approximate Poisson distribution with mean $e\theta$. 

As an example, consider the demerits case. The total number of 
demerits has the general form 

\[
D = \sum_i w_i X_i ,
\]

where the $w_i$'s are known weights and the $X_i$'s have Poisson distributions. 
Assume that the mean of $X_i$ is $e_i\theta$, where $e_i$ is the standard mean 
of $X_i$ and $\theta$ is the population quality expressed on an index scale. So 
$\theta = 2$ means that all types of defects are occurring at twice the rate 
expected. 

The mean and variance of D are 

\[
E(D) = \sum_i w_i E(X_i) = \sum_i w_i e_i \theta = \theta E_s
\]

and 

\[
V(D) = \sum_i w_i^2 V(X_i) = \sum_i w_i^2 e_i \theta = \theta V_s ,
\]

where $E_s$ and $V_s$ are the standard mean and variance, respectively, of 
D. 

The mean and variance of equivalent defects, $X$, are

$$E(X) = \frac{E(D)}{V_s/E_s} = \frac{\theta E_s}{V_s/E_s} = \theta e$$

and

$$V(X) = \frac{V(D)}{(V_s/E_s)^2} = \frac{\theta V_s E_s^2}{V_s^2} = \theta e.$$

The mean and variance of $X$ are equal; so, $X$ has an approximate
Poisson distribution with mean $e\theta$. Of course, it is not exact, because
$X$ is not always integer valued. But this Poisson approximation for
equivalent defects is better than the normal approximation implied by
the T-rate system. It is the Poisson approximation in qmp that obviates
the need for the modification treatments discussed in Section 2.8.

A similar analysis works approximately for the defectives case. So, 
any aggregate of demerits, defectives, or defects can be transformed 
into equivalent defects. Just use the standard expectancy and variance 
as illustrated above for demerits. 
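
The transformation can be sketched in a few lines of Python. This is a minimal illustration rather than production audit code: the function names, the demerit weights, and the per-class standard means are assumptions chosen only to make the arithmetic concrete.

```python
def demerit_standards(weights, standard_means):
    """Standard mean E_s and variance V_s of D = sum(w_i * X_i),
    assuming each X_i is Poisson with the given standard mean."""
    e_s = sum(w * m for w, m in zip(weights, standard_means))
    v_s = sum(w * w * m for w, m in zip(weights, standard_means))
    return e_s, v_s


def equivalent_defects(q, e_s, v_s):
    """Return (x, e): equivalent defects and equivalent expectancy."""
    scale = v_s / e_s   # variance-to-mean ratio of Q at standard quality
    x = q / scale       # X = Q / (V_s / E_s)
    e = e_s / scale     # e = E_s^2 / V_s
    return x, e


# Hypothetical audit: four defect classes with illustrative weights
# and illustrative standard means (not taken from the paper).
weights = [100, 50, 10, 1]
standard_means = [0.01, 0.05, 0.5, 2.0]
e_s, v_s = demerit_standards(weights, standard_means)
x, e = equivalent_defects(q=30.0, e_s=e_s, v_s=v_s)
print(e_s, v_s, x, e)
```

Note that the index x/e equals Q/E_s, so the transformation rescales the count and the expectancy without changing the index itself.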

3.3 Statistical foundations of QMP 

The algorithm used for computing the qmp box chart shown in Fig.
1a was derived from a Bayesian analysis of a particular statistical
model. In this section we describe the model and put it in perspective.
This will provide an appreciation for how the box charts can be 
interpreted and why they are a useful management tool. 

3.3.1 QMP model 

For rating period $t$, let $x_t$ = equivalent defects in the audit sample,
$e_t$ = equivalent expectancy of the audit sample, $\theta_t$ = population index,
as defined in Section 3.2. Based on the discussion in Section 3.2, we
assume that the conditional distribution of $x_t$ given $\theta_t$ is Poisson with
mean $e_t\theta_t$, i.e.,

$$x_t \mid \theta_t \sim \text{Poisson}(e_t\theta_t).$$

In Fig. 3 we see that the season average varies from year to year. 
Some of that variation is due to the fact that the season is itself a 
sample from a conceptual infinite population of at bats. The rest of 
the variation is due to changes in ability, competition, etc., that are 
caused by numerous factors that may or may not be identifiable. The 
important concept is that the time series of season averages is a
stochastic process. For qmp we assume that the time series of $\theta_t$'s is an
unknown stochastic process.

For reasons that are partly statistical and partly administrative, we 
have decided to restrict our use of past data to five periods. The main 
administrative reason is that the T-rate system used the past five 
periods. So all of the T-rate administrative rules that dealt with 
missing data and reinitialization of rating classes can be used in qmp. 
Statistically, qmp works as well for six periods as it does for eight 
periods (one year). 

A consequence of using only six periods of data is that no useful
inference can be made about possible complex structure in the
stochastic process of $\theta_t$'s. So we assume simply that the $\theta_t$'s are a random
sample from an unknown distribution called the process distribution.
Furthermore, six observations are not enough to make fine inferences
about the family of this unknown distribution. So for mathematical
simplicity we assume it to be a gamma distribution with unknown
mean $\theta$ and variance $\gamma^2$ (Appendix A); i.e.,

$$\theta_t \overset{\text{iid}}{\sim} \text{Gamma}\!\left(\frac{\theta^2}{\gamma^2}, \frac{\gamma^2}{\theta}\right), \qquad t = 1, \ldots, T \ (\text{current period}).$$

The parameters $\theta^2/\gamma^2$ and $\gamma^2/\theta$ are the shape and scale parameters of
the gamma distribution. We use the names

$\theta$ = process average,

$\gamma^2$ = process variance.

This choice of a unimodal distribution reflects our experience that 
usually many independent factors affect quality; so there is a central 
limit theorem effect. 

We are assuming that the process average is unknown but fixed. In 
reality, it may be changing. We handle this by using a moving window 
of six periods of data. But this treats the past data symmetrically. An
alternative would be some kind of exponential smoothing or Kalman
filtering. My colleague M. S. Phadke is developing a generalization to
qmp based on a random walk model for the process average.

The model so far is an empirical Bayes model. 10 The parameter of
interest is the current population index, $\theta_T$, which has a distribution
called the process distribution. Bayesians would call it the prior
distribution if it were known. But we must use all the data to make an
inference about the unknown process distribution. So, the model is
called empirical Bayes.

Efron and Morris 10 take a classical approach to the empirical Bayes
model. They use classical methods of inference for the unknown
process distribution. qmp is based on a Bayesian approach to the
empirical Bayes model. Each product has its own process mean and
variance. These vary from product to product. By analyzing many
products, we can model this variation by a prior distribution for
$(\theta, \gamma^2)$.

Summarizing, our model is

$$x_t \mid \theta_t \sim \text{Poisson}(e_t\theta_t), \qquad t = 1, \ldots, T,$$

$$\theta_t \overset{\text{iid}}{\sim} \text{Gamma}\!\left(\frac{\theta^2}{\gamma^2}, \frac{\gamma^2}{\theta}\right), \qquad (\theta, \gamma^2) \sim \text{prior distribution } p(\theta, \gamma^2).$$

For now, $p(\theta, \gamma^2)$ remains general.

This is a full Bayesian model. It specifies the joint distribution of all
variables. The quality rating in qmp is based on the posterior
distribution of $\theta_T$ given $\mathbf{x} = (x_1, \ldots, x_T)$.

3.3.2 The model in perspective

Quality rating in qmp is based on posterior probabilities given the 
audit data. Of course these probabilities depend on the model. But 
how do we know the model is right? 

It is important to understand that we are not doing data analysis 
with qmp. In data analysis, each set of data is treated uniquely. 
Probabilities cannot be computed. Objective decisions cannot be made. 

A requirement of quality rating is a specific rule that defines quality
exceptions and a figure of merit (e.g., a probability) associated with an
exception. A statistical model provides both. qmp could have been
based on a more elaborate model. Our model represents a compromise
between simplicity and believability.

So our exception decisions are at least consistent with one simple 
model of reality. The probabilities are conditional on that model. 
Otherwise, they can only be interpreted as figures of merit. 

We have imbedded the simple hypothesis of a Poisson distribution
with a standard mean into a class of alternatives. The alternatives are
Poisson distributions with nonstandard means. Much more complicated
alternatives can be included, e.g., the class of negative binomial
distributions, and our probabilities would change a little. But qmp has
achieved a kind of empirical validity. The exceptions being identified 
are accepted by the managers being rated. And for the products 
declared normal, there is a model (i.e., our model) that affords the 
standard hypothesis some credence. 

3.3.3 Posterior distribution of current quality 

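We show in Section IV that it is computationally impractical to
derive the exact posterior distribution of $\theta_T$. The best we can do is
approximate the posterior mean and variance of $\theta_T$.

The posterior mean and variance of $\theta_T$ are derived in Section IV.
The posterior mean is

$$\hat\theta_T = E(\theta_T \mid \mathbf{x}) = \hat\omega_T \hat\theta + (1 - \hat\omega_T) I_T,$$

where

$$\hat\theta = E(\theta \mid \mathbf{x}), \qquad \hat\omega_T = E(\omega_T \mid \mathbf{x}), \qquad \omega_T = \frac{\theta/e_T}{\theta/e_T + \gamma^2}.$$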

The posterior mean, $\hat\theta_T$, is a weighted average between the estimated
process average, $\hat\theta$, and the defect index, $I_T$, of the current sample. It
is the dynamics of the weight, $\hat\omega_T$, that makes the Bayes estimate work
so well. For any $t$, the sampling variance of $I_t$ is

$$V(I_t \mid \theta_t) = V\!\left(\frac{x_t}{e_t} \,\Big|\, \theta_t\right) = \frac{1}{e_t^2} V(x_t \mid \theta_t) = \frac{1}{e_t^2}(e_t\theta_t) = \theta_t/e_t.$$

The expected value of this is

$$E[\theta_t/e_t] = \theta/e_t.$$

So the weight, $\omega_T$, is

$$\omega_T = \frac{[\text{expected sampling variance}]}{[\text{expected sampling variance}] + [\text{process variance}]}.$$

If the process is relatively stable, then the process variance is
relatively small and the weight is mostly on the process average; but
if the process is relatively unstable, then the process variance is
relatively large and the weight is mostly on the current sample index.
The reverse is true of the sampling variance. If it is relatively large
(e.g., small expectancy), then the current data is weak and the weight
is mostly on the process average; but if the sampling variance is
relatively small (e.g., large expectancy), then the weight is mostly on
the current sample index. In other words, $\omega_T$ is a monotonically increasing function
of the ratio of expected sampling variance to process variance.
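
The behavior of the weight can be checked numerically. The sketch below assumes the process average `theta` and process variance `gamma2` are known, whereas qmp estimates them from the data; the function names are illustrative only.

```python
def shrinkage_weight(theta, gamma2, e_t):
    """Weight on the process average:
    expected sampling variance / (expected sampling variance + process variance)."""
    sampling_var = theta / e_t
    return sampling_var / (sampling_var + gamma2)


def posterior_mean_known(theta, gamma2, x_t, e_t):
    """Posterior mean of the population index when theta and gamma2 are known."""
    w = shrinkage_weight(theta, gamma2, e_t)
    sample_index = x_t / e_t
    return w * theta + (1.0 - w) * sample_index


# Small audit (e = 1) versus large audit (e = 10), same process.
print(shrinkage_weight(1.0, 0.25, 1.0), shrinkage_weight(1.0, 0.25, 10.0))
# Stable process (small gamma2) versus unstable process (large gamma2).
print(shrinkage_weight(1.0, 0.05, 5.0), shrinkage_weight(1.0, 0.5, 5.0))
# Eight equivalent defects against an expectancy of four.
print(posterior_mean_known(1.0, 0.25, 8.0, 4.0))
```

In the last call the weight is one half, so the estimate falls midway between the process average of 1 and the sample index of 2.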

The posterior variance of $\theta_T$ is

$$V_T = (1 - \hat\omega_T)\hat\theta_T/e_T + \hat\omega_T^2 V(\theta \mid \mathbf{x}) + (\hat\theta - I_T)^2 V(\omega_T \mid \mathbf{x}).$$

If the process average and variance were known, then the posterior
variance of $\theta_T$ would be $(1 - \omega_T)\hat\theta_T/e_T$ (Appendix B). So the first term
is just an estimate of this. But since the process average and variance
are not known, the posterior variance has two additional terms. One
contains the posterior variance of the process average and the other
contains the posterior variance of the weight.

The first term dominates. A large $\hat\omega_T$ (relatively stable process), a
small $\hat\theta_T$ (good current quality), and a large $e_T$ (large audit) all tend to
make the posterior variance of $\theta_T$ small (the box chart short).

If $\hat\omega_T$ is small, then the second term is negligible. This is because the
past data is not used much, so the uncertainty about the process
average is irrelevant.

If the current sample index is far from the process average, then the 
third term can be important. This is because outlying observations add 
to our uncertainty as to what is happening. 

If the process average and variance were known, then the posterior
distribution would be gamma (Appendix B). So we approximate the
posterior distribution with a gamma fitted by the method of moments.
The parameters of the fitted gamma are

$$\alpha = \text{shape parameter} = \hat\theta_T^2/V_T, \qquad \tau = \text{scale parameter} = V_T/\hat\theta_T,$$

and the posterior cumulative distribution function is

$$\Pr\{\theta_T \le y \mid \mathbf{x}\} = G_\alpha(y/\tau)$$

(Appendix A).

Fig. 4 - qmp box and whisker chart. This is a graphical representation of the posterior
distribution for current production given the six most recent periods of audit data. The 
whiskers display the 99th and 1st percentiles and the box displays the 95th and 5th 
percentiles. The Best Measure is the posterior mean or Bayes estimate. It is a weighted 
average of the process average ("dot") and the current sample ("X"). The weight is the 
ratio of sampling variance to total variance. If all the variance is due to sampling, then 
the production is stable and the process average is the Best Measure of current quality. 
If the sampling variance is zero, then the current sample is the Best Measure. 



3.4 QMP reports 

3.4.1 QMP box chart 

The qmp box and whisker chart is shown in Fig. 4. $I_{99\%}, \ldots, I_{01\%}$
are defined by

$$1 - G_\alpha(I_{99\%}/\tau) = 0.99,$$

$$\vdots$$

$$1 - G_\alpha(I_{01\%}/\tau) = 0.01.$$

So, e.g., a posteriori, there is a 99 percent chance that $\theta_T$ is larger than
$I_{99\%}$.
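
A minimal sketch of how the four chart points might be computed, using only the Python standard library. It assumes the posterior is approximated by a gamma with shape equal to the squared Best Measure divided by the posterior variance, and scale equal to the posterior variance divided by the Best Measure. The function names, the series evaluation of the incomplete gamma function, and the bisection search are implementation choices made here, not the authors' procedure.

```python
import math


def gamma_cdf(alpha, z):
    """Regularized lower incomplete gamma function G_alpha(z), by power series."""
    if z <= 0:
        return 0.0
    term = 1.0
    total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (alpha + n)
        total += term
    return math.exp(alpha * math.log(z) - z - math.lgamma(alpha + 1)) * total


def upper_tail_point(best_measure, post_var, tail_prob):
    """Return y such that the fitted gamma puts probability tail_prob above y."""
    alpha = best_measure ** 2 / post_var   # shape, by the method of moments
    tau = post_var / best_measure          # scale, by the method of moments
    target = 1.0 - tail_prob
    lo, hi = 0.0, 1.0
    while gamma_cdf(alpha, hi) < target:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                   # bisection on the standardized scale
        mid = 0.5 * (lo + hi)
        if gamma_cdf(alpha, mid) < target:
            lo = mid
        else:
            hi = mid
    return tau * 0.5 * (lo + hi)


# Hypothetical Best Measure 1.2 with posterior variance 0.09.
box = {p: upper_tail_point(1.2, 0.09, p) for p in (0.99, 0.95, 0.05, 0.01)}
print(box)
```

The whiskers correspond to the 0.99 and 0.01 entries and the box to the 0.95 and 0.05 entries.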

3.4.2 QMP Below Normal and ALERT definitions 

In qmp, a rating class is Below Normal (bn) if

$$I_{99\%} > 1;$$

i.e., the posterior probability that the product is substandard exceeds
0.99. Substandard means $\theta_T > 1$. A rating class is on alert if

$$I_{99\%} \le 1 < I_{95\%};$$

i.e., the posterior probability that the product is substandard exceeds
0.95 but not 0.99.
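
The two exception rules translate directly into a small classifier. In this sketch the function name and the label strings are illustrative; `i99` and `i95` stand for the chart points above which the posterior places 99 percent and 95 percent probability, respectively.

```python
def rating(i99, i95):
    """Classify a rating class from its 99% and 95% chart points."""
    if i99 > 1.0:
        return "Below Normal"   # Pr{substandard} exceeds 0.99
    if i95 > 1.0:
        return "ALERT"          # Pr{substandard} exceeds 0.95 but not 0.99
    return "Normal"


print(rating(1.10, 1.30))  # Below Normal
print(rating(0.90, 1.05))  # ALERT
print(rating(0.60, 0.80))  # Normal
```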

These definitions are illustrated graphically in Fig. 5, which is
oriented like the location summary in Fig. 6.

Fig. 5 - qmp exceptions. Below Normal means that the probability of substandard
quality exceeds 0.99. For alert, the probability exceeds 0.95 but not 0.99. Normal is not 
a quality exception. 



3.4.3 QMP report formats 

There are two report formats for qmp results. One is a time series of
box charts illustrated in Fig. 1a. The estimated process averages are
joined. The other is a location summary for the current rating period. 
This is illustrated in Fig. 6. It orders the rating classes by Best Measure 
for the current period. Another ordering that will be used is by rating 
class name. 

Western Electric, Bell Laboratories, and American Telephone and 
Telegraph management will receive all qmp results. Operating com- 
pany management will receive qmp results on those rating classes that 
are of direct interest to them. Examples of results provided to the 
operating companies are the quality of repaired telephone sets and 
installed switching systems. 

3.5 Advantages of QMP 

Many of the advantages of qmp relate to the disadvantages of the T- 
rate system (Section 2.8). qmp provides a direct measure of quality. If 
a rating class is Below Normal, one can tell how bad the quality is. 
qmp uses past and current data to make an inference about current 
quality not past quality. If a rating class is on alert, then it is over 95 
percent probable that there is a quality problem now. qmp does not 
use runs criteria, but uses the actual equivalent defects observed. This
provides more statistical efficiency and therefore shorter interval
estimates. qmp is robust against statistical "jitter." It does not
overreact to a few defects. Consequently, there is no need for special
modification treatments. This way we retain statistical objectivity
conditional on our model.

Another advantage compared to the T-rate is that qmp provides a 
lower producer's risk and consequently a more accurate list of excep- 
tions. This is supported by data presented in Section VI. 

Finally, qmp will allow us to unify our reporting to Bell System 
management. In the past, the T-rate statistic did not meet the needs 
of the operating companies; so, we developed a collection of special 
reports for the operating companies. Since the qmp report format does 
meet operating company needs, the relevant subset of all the results is 
a useful report. 

IV. MATHEMATICAL DERIVATION OF QMP 
4.1 Exact solution 

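We are interested in the posterior distribution of $\theta_T$ given $\mathbf{x}$, for the
model described in Section 3.3.1. Now

$$\Pr\{\theta_T \le y \mid \mathbf{x}\} = \int_0^\infty\!\!\int_0^\infty \Pr\{\theta_T \le y \mid \theta, \gamma^2, x_T\}\, p(\theta, \gamma^2 \mid \mathbf{x})\, d\theta\, d\gamma^2,$$

where $p(\theta, \gamma^2 \mid \mathbf{x})$ is the posterior
distribution of $\theta, \gamma^2$ given $\mathbf{x}$.

From Appendix B, we know that the distribution of $\theta_T$ given $\theta$, $\gamma^2$,
and $x_T$ is gamma; so, $\Pr\{\theta_T \le y \mid \theta, \gamma^2, x_T\}$ can be expressed in terms of
an incomplete gamma function.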

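By Bayes theorem,

$$p(\theta, \gamma^2 \mid \mathbf{x}) = \frac{p(\theta, \gamma^2) L(\theta, \gamma^2)}{\iint p(\theta, \gamma^2) L(\theta, \gamma^2)\, d\theta\, d\gamma^2},$$

where $p(\theta, \gamma^2)$ is the prior density of $\theta, \gamma^2$ and $L(\theta, \gamma^2)$ is the likelihood
function. Since $x_t$ given $\theta_t$ is Poisson and $\theta_t$ given $\theta, \gamma^2$ is gamma, it
follows that $x_t$ given $\theta, \gamma^2$ is negative binomial; hence,

$$L(\theta, \gamma^2) = \prod_{t=1}^{T} L_t(\theta, \gamma^2),$$

where

$$L_t(\theta, \gamma^2) = \frac{\Gamma(\theta^2/\gamma^2 + x_t)}{x_t!\,\Gamma(\theta^2/\gamma^2)}\, \omega_t^{\theta^2/\gamma^2} (1 - \omega_t)^{x_t}, \qquad \omega_t = \omega_t(\theta, \gamma^2) = \frac{\theta/e_t}{\theta/e_t + \gamma^2}.$$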
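
As a sketch of the likelihood, the code below evaluates the negative binomial that results from mixing a Poisson count with mean e_t times theta_t over a gamma process distribution with mean theta and variance gamma2. The function names and the example data are assumptions made for illustration; the parameterization (shape theta^2/gamma2, success probability w = (theta/e_t)/(theta/e_t + gamma2)) follows from that mixture.

```python
import math


def log_nb_pmf(x, e_t, theta, gamma2):
    """Log probability of x equivalent defects given theta and gamma2."""
    shape = theta * theta / gamma2
    w = (theta / e_t) / (theta / e_t + gamma2)
    return (math.lgamma(shape + x) - math.lgamma(x + 1) - math.lgamma(shape)
            + shape * math.log(w) + x * math.log(1.0 - w))


def log_likelihood(xs, es, theta, gamma2):
    """Log of L(theta, gamma2): sum of per-period log terms."""
    return sum(log_nb_pmf(x, e, theta, gamma2) for x, e in zip(xs, es))


# Hypothetical six periods of equivalent defects and expectancies.
xs = [5, 4, 6, 5, 5, 10]
es = [5.0] * 6
print(log_likelihood(xs, es, theta=1.0, gamma2=0.25))
```

Evaluating this function over a grid of theta and gamma2 values is what the exact solution would require inside the double integrals, which gives a sense of the computational cost discussed next.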



So the posterior distribution of $\theta_T$ is a complex triple integral that
has to be inverted to compute the qmp box chart. The posterior mean
and variance of $\theta_T$ can be expressed in terms of several double integrals.
There are more than 1,000 rating classes that have to be analyzed each
period, so computational efficiency is important. This is why we have
developed an efficient heuristic solution to the problem.

4.2 Empirical priors for process parameters 

It is clear from Section 4.1 that prior distributions for $\theta$ and $\gamma^2$ are
needed. In the fourth rating period of 1979, we applied an earlier
version of the qmp algorithm to over 1,000 rating classes. This provided
over 1,000 estimates of $\theta$ and $\gamma^2$ and empirical distributions of these
estimates. The empirical mean and variance of the $\theta$ estimates were
0.75 and 0.17, respectively. The empirical mean, variance, and mode of
the $\gamma^2$ estimates were 0.28, 0.19, and 0.05, respectively.

In the remainder of Section IV, we use 1 as a mean value of $\theta$ instead
of 0.75. This is because 1 is the desired standard value that minimizes
first cost plus maintenance costs. Under qmp, the shops will be able to
operate on the average closer to 1, because the producer's risk (see
Section 6.2) is smaller than for the T-rate. Also, more defects are
assessed under qmp than for the T-rate (see Sections 2.2 and 3.1).

4.3 Posterior mean of current quality 

For the model described in Section 3.3.1,

$$\hat\theta_T = E(\theta_T \mid \mathbf{x}) = E[E(\theta_T \mid \theta, \gamma^2, \mathbf{x}) \mid \mathbf{x}].$$

Conditioning on $\theta$ and $\gamma^2$ means that the process distribution is known.
So by Theorem B.1 in Appendix B,

$$\hat\theta_T = E[\omega_T\theta + (1 - \omega_T) I_T \mid \mathbf{x}]. \qquad (1)$$

To calculate this posterior expectation exactly requires a double
integral. But a posterior expectation, $E[\,\cdot \mid \mathbf{x}]$, can be viewed as an
estimate of the operand "$\cdot$", because it is the Bayes estimate. So all
we need are estimates of $\omega_T$ and $\theta$.

4.3.1 Moment estimates of process parameters

As argued in Section 4.1, given $\theta$ and $\gamma^2$, $x_t$ has a negative binomial
distribution. We show in Appendix D, eqs. (56) and (58), that

$$E(I_t) = \theta, \qquad E(Y_t) = \gamma^2,$$

where

$$Y_t = (I_t - \theta)^2 - I_t/e_t. \qquad (2)$$

So we have many independent estimates of $\theta$ and $\gamma^2$. A general
method of combining independent estimates of parameters is a
weighted average, where the weights are proportional to the reciprocal
of the variances of the individual independent estimates. Such
estimates of $\theta$ and $\gamma^2$ are

$$\hat\theta = \sum_t p_t I_t \qquad (3)$$

and

$$\hat\gamma^2 = \sum_t q_t Y_t, \qquad (4)$$

where

$$\sum_t p_t = \sum_t q_t = 1, \qquad p_t \propto 1/V(I_t), \qquad q_t \propto 1/V(Y_t).$$

Notice that $Y_t$ depends on $\theta$. So in the application, we replace $\theta$ by
an estimate.

Now $V(I_t)$ and $V(Y_t)$ depend on the unknown parameters $\theta$ and $\gamma^2$.
The important consideration in setting the weights $p_t$ and $q_t$ is their
general behavior as $e_t$ varies. So for simplicity (to avoid iteration), we
choose $\theta = 1$ and $\gamma^2 = 1/4$, which are empirically determined mean
values of these process parameters (see Section 4.2).

In Appendix D, we derive formulas for $V(I_t)$ and $V(Y_t)$. Plugging
$\theta = 1$ and $\gamma^2 = 1/4$ into eqs. (56) and (59) yields

$$p_t \propto f_t = \frac{1}{V(I_t)} = \left(\frac{1}{e_t} + \frac{1}{4}\right)^{-1} = \frac{e_t}{1 + e_t/4}, \qquad (5)$$

$$q_t \propto g_t = \frac{1}{V(Y_t)} = \left(\frac{2.5}{e_t^2} + \frac{1.5}{e_t} + 0.22\right)^{-1} = \frac{e_t^2}{2.5 + 1.5e_t + (0.22)e_t^2}. \qquad (6)$$

Note that for small $e_t$, $f_t \propto e_t$; but for large $e_t$, the $f_t$'s and therefore
the weights, $p_t$'s, are all about equal. This is because for any large $e_t$,
$I_t \approx \theta_t$ and we are trying to estimate the average of the $\theta_t$'s.
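
The two weighting factors are simple rational functions of the equivalent expectancy, f = e/(1 + e/4) and g = e^2/(2.5 + 1.5e + 0.22e^2). The sketch below evaluates them and normalizes them into weights; the function names and the sample expectancies are assumptions for illustration.

```python
def f_weight(e_t):
    """Reciprocal of V(I_t) at theta = 1, gamma^2 = 1/4."""
    return e_t / (1.0 + e_t / 4.0)


def g_weight(e_t):
    """Reciprocal of V(Y_t) at theta = 1, gamma^2 = 1/4."""
    return e_t ** 2 / (2.5 + 1.5 * e_t + 0.22 * e_t ** 2)


def normalize(values):
    """Scale a list of positive factors so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


# Hypothetical expectancies for six rating periods.
expectancies = [0.5, 2.0, 5.0, 5.0, 8.0, 20.0]
p = normalize([f_weight(e) for e in expectancies])
q = normalize([g_weight(e) for e in expectancies])
print(p)
print(q)
```

For small expectancies the factor f grows roughly in proportion to e, and for large expectancies it levels off near 4, which is the limiting behavior described in the text.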

4.3.2 Bayes estimate of the process average 

In the case $I_t = 0$ for all $t$, there is a problem with the estimate $\hat\theta$. If
we plug $\hat\theta = 0$ into (1), then $\hat\theta_T = 0$. But $\hat\theta_T$ is a posterior mean of a
positive parameter, so it cannot be zero. The correct method of
handling this problem is to start with a proper prior distribution on
the process average, $\theta$. But then the mathematics and the
computations become complicated.

So we assert that we have prior information that is equivalent to
observing some "prior data," $x_0$ and $e_0$. Then a Bayes type estimate
has the form

$$\hat\theta = \sum_{t=0}^{T} p_t I_t, \qquad (7)$$

which has the same form as the moment estimate (3), but uses all the
data including the "prior data."

To choose values for $x_0$ and $e_0$, consider $T = 1$. A generic form of a
Bayes type estimate of $\theta$ is

$$wE(\theta) + (1 - w)I_1,$$

where

$$w = \frac{V(I_1)}{V(I_1) + V(\theta)}.$$

Setting this generic form equal to (7) yields

$$E(\theta) = x_0/e_0, \qquad V(\theta) = 1/e_0 + 1/4.$$

From Section 4.2, $E(\theta) = 1$; and we conservatively choose $V(\theta)$ to be
1.25 (we do not want our prior observations of estimates to preclude
large future values of $\theta$). This implies $x_0 = e_0 = 1$.
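
A short sketch of the process average with the "prior data" prepended as period 0. The function name and the sample data are assumptions; the weights are the normalized factors e/(1 + e/4) from Section 4.3.1.

```python
def process_average(xs, es, x0=1.0, e0=1.0):
    """Weighted average of sample indices, including the prior pseudo-observation."""
    xs = [x0] + list(xs)
    es = [e0] + list(es)
    f = [e / (1.0 + e / 4.0) for e in es]
    total = sum(f)
    return sum(ft / total * (x / e) for ft, x, e in zip(f, xs, es))


# Six hypothetical periods with no defects at all: the estimate stays positive.
print(process_average([0.0] * 6, [5.0] * 6))
# Six hypothetical periods exactly at standard: the estimate is one.
print(process_average([5.0] * 6, [5.0] * 6))
```

With six defect-free periods of expectancy five, the prior observation keeps the estimate slightly above zero, which is the purpose of introducing it.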

4.3.3 Bayes estimate of weight 

Now define an estimate of $\gamma^2$ analogous to (7),

$$\hat\gamma_1^2 = \sum_{t=0}^{T} q_t (I_t - \hat\theta)^2 - \sum_{t=0}^{T} q_t (I_t/e_t),$$

as suggested by eqs. (2) and (4).* The first term is the total variance
about the process average and the second term is a weighted average
of estimated sampling variances. [Recall from Section 3.3.3 that
$V(I_t \mid \theta_t) = \theta_t/e_t$, which can be estimated by $I_t/e_t$.] We denote the
average sampling variance by

$$\sigma^2 = \sum_{t=0}^{T} q_t (I_t/e_t). \qquad (8)$$

The problem with $\hat\gamma_1^2$ as an estimate of process variance is that it
can be negative. To solve this problem, we use the results in Appendix
C. Assume $\sigma^2$ is a known constant, and define the unknown weight as

$$\omega = \frac{\sigma^2}{\sigma^2 + \gamma^2}.$$

* Note that we treat the "prior data" as real data.

To apply Appendix C, we must find a statistic, ss, and a degree of
freedom, df, so that, approximately,

$$\frac{\omega}{\sigma^2}(\text{ss}) \sim \chi^2_{df}.$$

Originally, we just assumed approximate normality of $I_t$ and took
$\text{ss} = (df + 1)\sum_{t=0}^{T} q_t (I_t - \hat\theta)^2$ and $df = T$. But we found unusual sets of
data for which the number of defects allowed (before declaration of
Below Normal) was a decreasing function of expectancy in short ranges
of small expectancy. We dubbed this the "qmp wiggle."

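To solve this problem, we approximate the sampling distribution of

$$Z = \sum_{t=0}^{T} q_t (I_t - \hat\theta)^2$$

by a scaled chi-square with degrees of freedom deduced by the method
of moments.
Let

$$Z_1 = \sum_{t=0}^{T} q_t (I_t - \theta)^2,$$

and try an ss of the form

$$\text{ss} = uZ,$$

where $u$ is an unknown constant. The two moment equations that
have to be satisfied are

$$E\!\left[\frac{\omega}{\sigma^2}(uZ)\right] = df, \qquad (9)$$

$$\frac{E^2[(\omega/\sigma^2)(uZ)]}{V[(\omega/\sigma^2)(uZ)]} = \frac{E^2[\chi^2_{df}]}{V[\chi^2_{df}]}.$$

Since $E[\chi^2_{df}] = df$ and $V[\chi^2_{df}] = 2\,df$, the second equation is

$$\frac{2E^2[Z]}{V[Z]} = df. \qquad (10)$$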

Inspired by well-known normal theory, we use the approximations

$$E(Z) \approx \left(\frac{df}{df + 1}\right) E(Z_1), \qquad (11)$$

$$\frac{2E^2(Z)}{V(Z)} \approx \frac{2E^2(Z_1)}{V(Z_1)} - 1. \qquad (12)$$

Now using eqs. (11) and (56), eq. (9) becomes

$$E\!\left[\frac{\omega}{\sigma^2}(uZ)\right] = \frac{\omega}{\sigma^2}\, u \left(\frac{df}{df + 1}\right)(\gamma^2 + \sigma^2) = u\left(\frac{df}{df + 1}\right) = df;$$

hence,

$$u = df + 1.$$

As for eq. (12), the mean and variance of $Z_1$ depend on $\theta$ and $\gamma^2$. So
to avoid iteration, we now select $\theta = 1$ and $\gamma^2 = 0$,* which were
empirically determined in Section 4.2. Then by eqs. (12), (56), and (57),

$$\frac{2E^2[Z]}{V[Z]} \approx \frac{2\left[\sum_t q_t (1/e_t)\right]^2}{\sum_t q_t^2 \left(1/e_t^3 + 2/e_t^2\right)} - 1 = df. \qquad (13)$$

So our statistic ss is

$$\text{ss} = (df + 1)\sum_{t=0}^{T} q_t (I_t - \hat\theta)^2,$$

where df is given by eq. (13).
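
The degrees-of-freedom calculation can be sketched as below. The function name is illustrative, and the trailing "- 1" is an assumption: it is what eq. (12) implies when theta = 1 and gamma^2 = 0 are substituted, and it reproduces df = T in the equal-weight, large-expectancy case, but it should be checked against the printed eq. (13).

```python
def degrees_of_freedom(q, e):
    """Method-of-moments degrees of freedom for the weighted sum of squares.

    q: weights summing to one (including the prior period, if used)
    e: equivalent expectancies for the same periods
    """
    numerator = 2.0 * sum(qt / et for qt, et in zip(q, e)) ** 2
    denominator = sum(qt ** 2 * (1.0 / et ** 3 + 2.0 / et ** 2)
                      for qt, et in zip(q, e))
    return numerator / denominator - 1.0


# Six equally weighted periods with very large expectancy: close to 5.
n = 6
print(degrees_of_freedom([1.0 / n] * n, [1e6] * n))
# The same periods with small expectancy give fewer degrees of freedom.
print(degrees_of_freedom([1.0 / n] * n, [0.5] * n))
```

The second call illustrates why the normal-theory value df = T overstates the information in small audits, which is the situation that produced the "qmp wiggle."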

Now apply the Corollary to Theorem C.1 in Appendix C to get

$$\omega \mid \text{ss} \sim \text{C-Gamma}\!\left(a, \frac{\sigma^2}{b}, 1\right),$$

where

$$a = a_0 + \frac{df}{2}, \qquad b = b_0 + \frac{\text{ss}}{2}. \qquad (14)$$

Define

$$S^2 = b/a = \frac{2b_0 + (df + 1)\sum_{t=0}^{T} q_t (I_t - \hat\theta)^2}{2a_0 + df}, \qquad (15)$$

$$R = S^2/\sigma^2. \qquad (16)$$

Now apply Theorem C.2 in Appendix C to get

$$E(\omega \mid \mathbf{x}) = \frac{1}{RF}, \qquad (17)$$

$$V(\omega \mid \mathbf{x}) = G. \qquad (18)$$

* The choice of $\gamma^2 = 0$ here may seem inconsistent with choice of $\gamma^2 = 1/4$ in the
definition of $f_t$ and $g_t$. There it was necessary to take a positive value (the empirical
mean across products) of $\gamma^2$ to get the correct behavior for large $e_t$. Here it was not
necessary, so for simplicity, we took the approximate empirical mode across products,
$\gamma^2 = 0$.

To determine the parameters $(a_0, b_0)$ of the prior distribution of $\omega$,
we first develop an empirical distribution of estimated $\omega$'s across many
rating classes, which have a mean and variance of 0.6 and 0.03,
respectively. To be conservative, we inflate the variance, shrink the
mean, and select the prior mean and variance of $\omega$ to be 0.55 and 0.045,
respectively. The parameters $a_0$ and $b_0$ are then solutions to (see
Appendix C)

$$\frac{1}{R_0 F} = 0.55,$$

$$G = 0.045,$$

where $F$ and $G$ are defined in Theorem C.2 in terms of $a_0$ and $R_0 =
b_0/a_0\sigma^2$. A numerical analysis yields

$$a_0 = 4.5,$$

$$R_0 = b_0/a_0\sigma^2 = 1.6,$$

or

$$a_0 = 4.5, \qquad (19)$$

$$b_0 = (7.2)\sigma^2. \qquad (20)$$
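
The prior values can be checked numerically. The sketch below assumes the series form of B given in the algorithm summary of Section 4.5 (T(0) = 1, T(i) = T(i-1) aR/(a+i)), with F = B/(B - 1) and G = (1/RF)[(a+1)/(aR) - (F - 1) - 1/(RF)]. These closed forms are a reconstruction consistent with a gamma density for the weight truncated at one; they should be verified against Theorem C.2, and the function name is illustrative.

```python
def f_and_g(a, r, tol=1e-14):
    """Return (F, G) for shape a and variance ratio r by summing the series B."""
    term = 1.0
    b_sum = 1.0
    i = 0
    while term > tol * b_sum:
        i += 1
        term *= a * r / (a + i)   # T(i) = T(i-1) * a*R / (a + i)
        b_sum += term
    f = b_sum / (b_sum - 1.0)
    g = (1.0 / (r * f)) * ((a + 1.0) / (a * r) - (f - 1.0) - 1.0 / (r * f))
    return f, g


f0, g0 = f_and_g(4.5, 1.6)
print(1.0 / (1.6 * f0), g0)   # prior mean and variance of the weight
```

With a = 4.5 and R = 1.6 the implied prior mean and variance of the weight come out close to the stated targets of 0.55 and 0.045.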

Now we define an estimate, $\hat\gamma^2$, of the process variance by

$$E(\omega \mid \mathbf{x}) = \frac{\sigma^2}{\sigma^2 + \hat\gamma^2}.$$

So by eq. (17)

$$\hat\gamma^2 = FS^2 - \sigma^2 = (FR - 1)\sigma^2. \qquad (21)$$

This is our improvement of the moments estimate. The inflation factor
$F$ prevents $\hat\gamma^2$ from being negative or zero. It can be shown that if $R$ is
large, $F$ is approximately one; but if $R$ is small, $FR - 1$ is positive and
$F$ is large.

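We are now in a position to estimate $\omega_T$ by

$$\hat\omega_T = \frac{\hat\sigma_T^2}{\hat\sigma_T^2 + \hat\gamma^2}, \qquad (22)$$

where

$$\hat\sigma_T^2 = \hat\theta/e_T, \qquad (23)$$

and our approximation to (1) is

$$\hat\theta_T = \hat\omega_T \hat\theta + (1 - \hat\omega_T) I_T. \qquad (24)$$

4.4 Posterior variance of current quality

For the model described in Section 3.3.1,

$$V_T = V(\theta_T \mid \mathbf{x}) = E[V(\theta_T \mid \theta, \gamma^2, \mathbf{x}) \mid \mathbf{x}] + V[E(\theta_T \mid \theta, \gamma^2, \mathbf{x}) \mid \mathbf{x}].$$

Conditioning on $\theta$ and $\gamma^2$ amounts to the process distribution being
known. So by Theorem B.1, in Appendix B,

$$V_T = E[(1 - \omega_T)E(\theta_T \mid \theta, \gamma^2, \mathbf{x})/e_T \mid \mathbf{x}] + V[\omega_T(\theta - I_T) \mid \mathbf{x}].$$

Conditioning on $\gamma^2$ in the second term yields

$$V_T = E[(1 - \omega_T)E(\theta_T \mid \theta, \gamma^2, \mathbf{x})/e_T \mid \mathbf{x}] + E[V[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}] \mid \mathbf{x}] + V[E[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}] \mid \mathbf{x}], \qquad (25)$$

so the posterior variance has three components.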

4.4.1 First component

The first component is approximated by regarding the posterior
expectation operator as an estimation operator, and it is

$$(1 - \hat\omega_T)\hat\theta_T/e_T. \qquad (26)$$

4.4.2 Second component

To approximate the second component, we first approximate
$V[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}]$. Since $\omega_T$ depends primarily on $\gamma^2$ and $e_T$, we shall
consider $\omega_T$ a constant. So

$$V[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}] \approx \omega_T^2 V(\theta \mid \gamma^2, \mathbf{x}). \qquad (27)$$

We use the approximation

$$V(\theta \mid \gamma^2, \mathbf{x}) \approx V(\hat\theta \mid \gamma^2, \theta = \hat\theta). \qquad (28)$$

Now by eq. (57),

$$V(\hat\theta \mid \gamma^2, \theta) = V\!\left(\sum_t p_t I_t \,\Big|\, \gamma^2, \theta\right) = \sum_t p_t^2 V(I_t \mid \gamma^2, \theta) = \sum_t p_t^2 [\gamma^2 + \theta/e_t]. \qquad (29)$$

Plugging eqs. (28) and (29) into eq. (27) yields

$$V[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}] \approx \omega_T^2 \sum_t p_t^2 [\gamma^2 + \hat\theta/e_t].$$

Again, treating the posterior expectation operator as an estimation
operator, we get for the second component of eq. (25)

$$\hat\omega_T^2 \sum_t p_t^2 [\hat\gamma^2 + \hat\theta/e_t]. \qquad (30)$$

4.4.3 Third component

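For the third component in eq. (25), we first approximate
$E[\omega_T(\theta - I_T) \mid \gamma^2, \mathbf{x}]$ by $\omega_T(\hat\theta - I_T)$, where

$$\omega_T = \frac{\hat\theta/e_T}{\hat\theta/e_T + \gamma^2}.$$

So the third component in eq. (25) is

$$(\hat\theta - I_T)^2 V(\omega_T \mid \mathbf{x}). \qquad (31)$$

If we define

$$r_T = \frac{\hat\theta/e_T}{\sigma^2} \qquad (32)$$

and

$$\omega = \frac{\sigma^2}{\sigma^2 + \gamma^2},$$

then

$$\omega_T = \frac{r_T\,\omega}{(r_T - 1)\omega + 1} = h(\omega).$$

So

$$V(\omega_T \mid \mathbf{x}) \approx [h'(\hat\omega)]^2 V(\omega \mid \mathbf{x}), \qquad (33)$$

where

$$\hat\omega = \frac{\sigma^2}{\sigma^2 + \hat\gamma^2}, \qquad (34)$$

$$h'(\hat\omega) = \frac{r_T}{[(r_T - 1)\hat\omega + 1]^2}.$$

Equations (31), (33), and (18) imply that the third component of eq.
(25) is

$$\frac{r_T^2 (\hat\theta - I_T)^2}{[(r_T - 1)\hat\omega + 1]^4}\, G. \qquad (35)$$

Putting eqs. (26), (30), and (35) together implies that the
approximate posterior variance of $\theta_T$ is

$$V_T = (1 - \hat\omega_T)(\hat\theta_T/e_T) + \hat\omega_T^2 \sum_t p_t^2 [\hat\gamma^2 + \hat\theta/e_t] + \frac{r_T^2 (\hat\theta - I_T)^2}{[(r_T - 1)\hat\omega + 1]^4}\, G. \qquad (36)$$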

4.5 QMP algorithm

Here we summarize the qmp formulas. On the right side of the
formulas are the section numbers or equation numbers where the
formulas were derived.

The audit data for $t = 1, \ldots, T$ is the following:

$Q_t$ = attribute quality measure in the sample, period $t$ (total
defects, defectives, or demerits),
$E_{st}$ = expected value of $Q_t$ given standard quality,
$V_{st}$ = sampling variance of $Q_t$ given standard quality.

For each period compute the following:

Equivalent defects:

$$x_t = \frac{Q_t}{V_{st}/E_{st}}, \qquad \text{(Section 3.2)}$$

Equivalent expectancy:

$$e_t = E_{st}^2/V_{st}.$$

For the "prior data" ($t = 0$), let $x_0 = e_0 = 1$. (Section 4.3.2)

For $t = 0, \ldots, T$, compute the following:



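Sample index:

$$I_t = x_t/e_t,$$

Weighting factors for computing process average and variance:

$$f_t = \frac{e_t}{1 + e_t/4}, \qquad (5)$$

$$g_t = \frac{e_t^2}{2.5 + 1.5e_t + (0.22)e_t^2}, \qquad (6)$$

Corresponding weights:

$$p_t = f_t/\sum f_t, \qquad (5)$$

$$q_t = g_t/\sum g_t. \qquad (6)$$

Over all periods $t = 0, \ldots, T$ compute the following:

Process average:

$$\hat\theta = \sum p_t I_t, \qquad (7)$$

Degrees of freedom:

$$df = \frac{2\left[\sum q_t (1/e_t)\right]^2}{\sum q_t^2 (1/e_t^3 + 2/e_t^2)} - 1, \qquad (13)$$

Total observed variance:

$$S^2 = \frac{(14.4)\sigma^2 + (df + 1)\sum q_t (I_t - \hat\theta)^2}{9 + df}, \qquad (15), (19), (20)$$

Estimated average sampling variance:

$$\sigma^2 = \sum q_t (I_t/e_t), \qquad (8)$$

Variance ratio:

$$R = S^2/\sigma^2, \qquad (16)$$

F and G:

$$a = 4.5 + \frac{df}{2}, \qquad (14), (19)$$

$$B = \sum_{i=0}^{\infty} T(i), \qquad T(0) = 1, \qquad T(i) = T(i - 1)\,\frac{aR}{a + i},$$

$$F = \frac{B}{B - 1}, \qquad (17), (50)$$

$$G = \frac{1}{RF}\left[\frac{a + 1}{aR} - (F - 1) - \frac{1}{RF}\right], \qquad (18), (54)$$

Current sampling variance:

$$\hat\sigma_T^2 = \hat\theta/e_T, \qquad (23)$$

Sampling variance ratio:

$$r_T = \hat\sigma_T^2/\sigma^2, \qquad (32)$$

Process variance:

$$\hat\gamma^2 = FS^2 - \sigma^2 = (FR - 1)\sigma^2, \qquad (21)$$

Weights:

$$\hat\omega_T = \hat\sigma_T^2/(\hat\sigma_T^2 + \hat\gamma^2), \qquad (22)$$

$$\hat\omega = \sigma^2/(\sigma^2 + \hat\gamma^2) = 1/FR, \qquad (34)$$

Best measure of current quality:

$$\hat\theta_T = \hat\omega_T \hat\theta + (1 - \hat\omega_T) I_T, \qquad (24)$$

Posterior variance of current quality:

$$V_T = (1 - \hat\omega_T)(\hat\theta_T/e_T) + \hat\omega_T^2 \sum p_t^2 [\hat\gamma^2 + \hat\theta/e_t] + \frac{r_T^2 (\hat\theta - I_T)^2}{[(r_T - 1)\hat\omega + 1]^4}\, G, \qquad (36)$$

Box chart percentiles:

$$\alpha = \hat\theta_T^2/V_T, \qquad \tau = V_T/\hat\theta_T, \qquad \text{(Section 3.3.3)}$$

$I_{99\%}$, $I_{95\%}$, $I_{05\%}$, $I_{01\%}$ defined by:

$$1 - G_\alpha(I_{99\%}/\tau) = 0.99, \qquad \text{(Section 3.4.1)}$$

$$\vdots$$

$$1 - G_\alpha(I_{01\%}/\tau) = 0.01.$$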
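
The summary can be sketched end to end in Python, up to the fitted gamma parameters. This is an illustrative reading of the formulas, not the production implementation: the function name, the dictionary keys, and the sample data are assumptions, the "- 1" in the degrees of freedom follows eq. (12), and the expressions for F and G are reconstructions that should be checked against Appendix C.

```python
def qmp(q_counts, e_std, v_std):
    """Best Measure, posterior variance, and gamma parameters for the last period."""
    # Equivalent defects and expectancy, with prior data x0 = e0 = 1 at t = 0.
    x = [1.0] + [q * es / vs for q, es, vs in zip(q_counts, e_std, v_std)]
    e = [1.0] + [es * es / vs for es, vs in zip(e_std, v_std)]
    idx = [xt / et for xt, et in zip(x, e)]

    # Weights for the process average (p) and the variance estimates (qw).
    f = [et / (1.0 + et / 4.0) for et in e]
    g = [et ** 2 / (2.5 + 1.5 * et + 0.22 * et ** 2) for et in e]
    p = [ft / sum(f) for ft in f]
    qw = [gt / sum(g) for gt in g]

    theta = sum(pt * it for pt, it in zip(p, idx))
    df = (2.0 * sum(qt / et for qt, et in zip(qw, e)) ** 2
          / sum(qt ** 2 * (1.0 / et ** 3 + 2.0 / et ** 2) for qt, et in zip(qw, e))
          - 1.0)
    sigma2 = sum(qt * it / et for qt, it, et in zip(qw, idx, e))
    spread = sum(qt * (it - theta) ** 2 for qt, it in zip(qw, idx))
    s2 = (14.4 * sigma2 + (df + 1.0) * spread) / (9.0 + df)
    r = s2 / sigma2

    # F and G from the series B = sum T(i).
    a = 4.5 + df / 2.0
    term, b_sum, i = 1.0, 1.0, 0
    while term > 1e-14 * b_sum:
        i += 1
        term *= a * r / (a + i)
        b_sum += term
    big_f = b_sum / (b_sum - 1.0)
    big_g = (1.0 / (r * big_f)) * ((a + 1.0) / (a * r)
                                   - (big_f - 1.0) - 1.0 / (r * big_f))

    sigma2_t = theta / e[-1]               # current sampling variance
    r_t = sigma2_t / sigma2                # sampling variance ratio
    gamma2 = (big_f * r - 1.0) * sigma2    # process variance estimate
    w_t = sigma2_t / (sigma2_t + gamma2)   # weight for the current period
    w = 1.0 / (big_f * r)                  # weight at average sampling variance

    best = w_t * theta + (1.0 - w_t) * idx[-1]
    var = ((1.0 - w_t) * best / e[-1]
           + w_t ** 2 * sum(pt ** 2 * (gamma2 + theta / et) for pt, et in zip(p, e))
           + r_t ** 2 * (theta - idx[-1]) ** 2 * big_g
           / ((r_t - 1.0) * w + 1.0) ** 4)
    return {"theta": theta, "weight": w_t, "best": best, "var": var,
            "alpha": best ** 2 / var, "tau": var / best}


# Hypothetical defect audit: expectancy 5 each period, sudden jump in period 6.
result = qmp([5, 4, 6, 5, 5, 10], [5.0] * 6, [5.0] * 6)
print(result)
```

In this example the Best Measure lands between the process average and the current sample index of 2, which is the tempering effect of a good past history.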

V. QMP DYNAMICS 

The Best Measure and the box chart percentiles are nonlinear 
functions of all the data, so the dynamic behavior of these results can 
appear to be complex. But this complex behavior is desirable and can 
be explained. This section characterizes the fundamental dynamics of 
qmp by example. 

5.1 Dynamics of sudden change

Since qmp is partially based on a long run average, it is natural to be 
concerned about responsiveness of the box chart to sudden change. If 
there is a sudden degradation of quality, Quality Assurance would like 
to detect it. If the producer solves a chronic quality problem, they 
would like their exceptions to disappear. Figures 7 and 8 illustrate the 
qmp dynamics of sudden change. 

The history data in Fig. 7 is a typical history for a product that is 
meeting the quality standard. The equivalent expectancy of five is 
average for a manufacturing audit. The history is plotted on a T-rate 
chart along with six possible values for the current T-rate (labeled A 
through F). So the current period is anywhere from standard (T-rate 
= 0) to well below standard (Index = 3.24, T-rate = -5). 

The right side of Fig. 7 shows the six possible current results plotted 
in qmp box-chart form. The box chart labeled A is the result of 
combining current result A with the past five periods. The box chart 
labeled F is the result of combining current result F with the same 
past history. 

As you can see, the qmp result becomes alert at about T-rate =
-3 (letter D) and becomes bn at about T-rate = -4 (letter E). For the
T-rate method of rating, you would have a bn at T-rate = -3. The
good past history has the effect of tempering the result of a T-rate =
-3.

Fig. 7 - Dynamics of sudden degradation. The six qmp box charts (labeled A through
F) result from the analysis of six time series of data, which all have the same past
history, but have different current values, as shown in the T-rate chart. A qmp alert is
triggered at a T-rate of -3 (letter D) and a qmp Below Normal is triggered at a T-rate
of -4 (letter E). So a good past history tempers an observed change. Notice that from
A to F, the Best Measure swings towards the sample value. This results from increasing
evidence of an unstable process (expected number of defects equals 5 for this chart).

It is informative to study the relative behavior of the current sample
index, process average, and Best Measure as the current value goes
from A to F. The current index changes a lot (from 1.00 to 3.24) and
the process average changes a little (from 1.00 to 1.38), both in a linear
way. The Best Measure also changes substantially, but in a nonlinear
way. It changes slowly at first and then speeds up. This is because the
weight is changing from 0.71 to 0.32. The weight changes, because as
the data becomes more and more inconsistent with the past the process
becomes more and more unstable, while the current sampling variance
changes slowly in proportion to the process average.

Figure 8 is the dual of Fig. 7. It illustrates the dynamics of sudden 
improvement. For the first five periods plotted, the process average is 
centered on an index of two. Then an improvement takes place and 
from the sixth period on, the sample index is at the standard value of 
one. 

For the first five periods plotted, the rate is bn four times and alert 
once. In the sixth period there is a sudden improvement and the 
sample index goes to standard. Immediately, there is a jump in the 
Best Measure and the rate is no longer bn. Because of the increase in 
process variance, the weight changes from 0.69 to 0.61, putting more 
weight on the current good result. The posterior variance stays about 
the same [$\hat\theta_t$ gets smaller but $(1 - \omega_t)$ gets larger]. 

For the next five periods the sample index stays at standard. During 
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these periods both the process average and the Best Measure gradually 
move up towards the standard. 

5.2 Bogie charts 

A Bogie chart is a graphical device for tracking quality assurance 
audit data during a rating period. Figures 9 and 10 are examples of 
Bogie charts. The vertical axis is an index scale and the horizontal axis 
is an equivalent expectancy scale. During the rating period, as the 
audit sample size builds up, the sample equivalent expectancy in- 
creases. So the horizontal axis can also be viewed as a time axis. 

The Bogie curves labeled alert and bn are plots of the indices in 
the current sample for which I95% and I99% (the 95th and 99th 
percentiles) are exactly one, respectively. So the Bogie curves depend 
on the past history. The past histories associated with Figs. 9 and 10 
have average indices of 0.92 and 4.89, respectively. The variances of the 
past histories were 0.69 and 5.36, respectively. 

To use the Bogie chart, you plot continuously through the period 
the sample index as a function of the equivalent expectancy in the 
sample (see Fig. 9). Anytime this plot falls below the alert or bn 
curve, the rate is alert or bn at the plotted equivalent expectancy. 
Then to bail the rate out, the plotted sample index must get above the 
Bogie curves before the end of the period. For example, in Fig. 9, if the 
period had ended at an equivalent expectancy of three, then the rate 
would be alert. If it had ended at an equivalent expectancy of five, 
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Fig. 8 — Dynamics of sudden improvement. As soon as the sample value becomes 
standard, the product is no longer in the quality exception report (expected number of 
defects equals 5 for this chart). 
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Fig. 9 — Index Bogie chart for a good past history. Equivalent expectancy is a measure 
of how many defects are expected in the sample; so, equivalent expectancy increases 
with sample size. During a rating period, as the sample size increases, one can track the 
observed sample index (dotted curve) and compare it to Below Normal and alert 
thresholds. 



then the rate would be bn. But the period ended at an equivalent 
expectancy of eight and there is no exception. 

The alert Bogie curve in Fig. 10 is interesting. It starts at zero, so 
you start the period on alert. The past history is so bad, that in the 
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Fig. 10 — Index Bogie chart for a substandard past history. The Below Normal and 
alert thresholds are very tight. At the beginning of the period, the product is on alert 
until proven otherwise. 
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absence of any current data the probability that the current quality 
will be substandard exceeds 0.95. 

5.3 Bogie contour plots 

For a fixed past history and current equivalent expectancy, there is 
a bn Bogie for the current sample index. If the sample index is worse 
than the bn Bogie, then the product is bn. Figure 11 is a contour plot 




Fig. 11 — Below Normal Bogie contour plot. If the past mean is 0.8 and the past 
variance is 0.7 (on an index scale), then the product is on the contour labeled 2.6. This 
means that if the current sample index exceeds 2.6, the product will be Below Normal 
(equivalent expectancy equals 5 for this chart). 
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of the bn Bogie for an equivalent expectancy of five. The axes are the 
mean and variance of the five past values of the sample index; i.e., 

$$I_p = \frac{1}{5}\sum_{t=1}^{5} I_t, \qquad S_p^2 = \frac{1}{5}\sum_{t=1}^{5} (I_t - I_p)^2,$$

where $I_t$ is the sample index in past period $t$. For given values of $I_p$ and 
$S_p^2$, we used a standard pattern of $I_t$'s to compute the Bogie. The 
results are insensitive to pattern. The dashed curve is an upper bound 
for $S_p^2$. 

To see how the contour plot works, consider an example. Suppose 
$I_p = 0.8$ and $S_p^2 = 0.7$. The point (0.8, 0.7) falls on the contour labeled 
2.6. This means that if the current sample index exceeds 2.6, then the 
product will be bn. The contour labeled 2.6 is the set of all pairs 
$(I_p, S_p^2)$ that yield a bn Bogie of 2.6. The T-rate associated with a bn Bogie 
of 2.6 is -3.6, as shown in Table II. 

Table II — Index to T-rate conversion table 

This contour plot summarizes the bn behavior of qmp for an equivalent 
expectancy of five. As $I_p$ gets larger than one, the bn Bogie gets 
smaller. If $I_p$ exceeds 1.6, then the bn Bogie is smaller than 2.34, which 
corresponds to a T-rate of -3. So in T-rate terms, bn triggers earlier 
than a T-rate of -3. 
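
The index and T-rate pairs quoted in this section (2.34 and -3, 2.6 and -3.6, 2.9 and -4.2, 3.24 and -5, all at an equivalent expectancy of five) are consistent with the relation T = sqrt(e)(1 - I). The sketch below assumes that relation; it is inferred from the quoted numbers rather than taken from a formula stated in this section, and the function name is illustrative only.

```python
import math

def index_to_t_rate(index, equivalent_expectancy):
    """Convert a sample index to a T-rate, ASSUMING T = sqrt(e) * (1 - I)."""
    return math.sqrt(equivalent_expectancy) * (1.0 - index)

# Indices quoted in the text for an equivalent expectancy of five.
for index in (2.34, 2.6, 2.9, 3.24):
    t_rate = index_to_t_rate(index, 5.0)
    print(f"index {index:4.2f} -> T-rate {t_rate:5.1f}")
```

Under this assumption the four quoted indices map back to the quoted T-rates after rounding to one decimal place.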

For $I_p$ less than 1.4, as $S_p^2$ gets larger, the bn Bogie gets smaller. 
This is because large $S_p^2$ implies large process variance, which makes 
an observed deviation more likely to be significant. 

For very small $S_p^2$, as you move from $I_p = 0$ to $I_p = 1$, the bn Bogie 
increases from 2.6 (T-rate = -3.6) to 2.9 (T-rate = -4.2). This is an 
apparent paradox. The better the process average, the less cushion the 
producer gets. 

This is not a paradox, but an important characteristic of qmp. 
Remember with qmp we are making an inference about current quality, 
not long-run quality. If we have a stable past with $I_p = 0.2$, and we 
suddenly get a sample index of 2.7, then this is very strong evidence 
that the process has changed and very probably become worse than 
standard. If we have a stable past with $I_p = 1$, and we suddenly get a 
sample defect index of 2.7, then the evidence of change is not as strong 
as with $I_p = 0.2$. The weight we put on the past data depends on how 
consistent the past is with the present. 

Notice that the maximum bn Bogie is 2.92 and occurs at $I_p = 0.85$ 
and $S_p^2 = 0$. It would be a mistake for the producer to conclude from 
the contour plot that he should control his process at $I_p = 0.85$ and 
$S_p^2 = 0$. He cannot achieve $S_p^2 = 0$. The sample index has substantial 
sampling variance that the producer cannot control. 

The Bogie contour plots provide the engineer with a manual tool to 
forecast the number of demerits that will be allowed by the end of a 
period. So we have published a book of bn and alert Bogie contour 
plots for equivalent expectancies from 0.5 to 25. 

5.4 Nonlinearity of QMP 

It is tempting to conjecture that if both the process average and the 
current sample index for one rate are worse than for another, then the 
Best Measure will also be worse. This is because the Best Measure is 
a weighted average between the process average and the current 
sample index. But, since the weight depends on the data nonlinearly, 
the conjecture is not true. 

To illustrate this, consider Fig. 12. The six sample indices in Chart 
B are uniformly worse than the six sample indices in Chart A. But the 
Best Measure in Chart B is better than for Chart A. The reason is that 
the weight in B is 0.54 vs 0.12 for Chart A. 
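
A small numerical sketch of this effect follows. Only the two weights (0.12 and 0.54) come from the text; the process averages and sample indices are hypothetical values chosen so that Chart B is worse than Chart A on both inputs. The weight is applied to the process average, as in the weighted-average form of the Best Measure.

```python
def best_measure(weight, process_average, sample_index):
    """Weighted average with `weight` placed on the process average."""
    return weight * process_average + (1.0 - weight) * sample_index

# Weights are from the text; process averages and indices are hypothetical.
chart_a = best_measure(weight=0.12, process_average=1.0, sample_index=3.0)
chart_b = best_measure(weight=0.54, process_average=1.2, sample_index=3.2)

print(f"Chart A Best Measure: {chart_a:.2f}")
print(f"Chart B Best Measure: {chart_b:.2f}")
```

Chart B has the worse process average and the worse sample index, yet its larger weight on the process average pulls its Best Measure below that of Chart A.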

VI. OPERATING CHARACTERISTICS 

The T-rate and qmp methods of rating are similar in some respects, 
but there are major differences. In this section, these differences are 
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Fig. 12— Nonlinearity of qmp. The sample indices in Chart B are uniformly worse 
than the sample indices in Chart A; but, the qmp result in Chart B is better than for 
Chart A. In Chart A, the data provides very strong statistical evidence of an unstable 
process, so the past data is used very little in estimating current quality. This is not as 
pronounced in Chart B. 

explored using operating characteristics. The differences are a result 
of different rating formulas and assessment practices. 

6.1 Ranges of probability substandard for T-rate exceptions 

A qmp analysis of a rating class provides a probability, ps, that the 
rating class is substandard. For a typical rating period analyzed in 
detail, we computed ps for all T-rate Below Normals and alerts. 
Table III shows the results. 

So we find that for T-rate bns, the qmp ps is typically high (greater 
than 0.97); but, there can be an occasional low ps (e.g., 0.75). However, 
for T-rate alerts, the qmp ps is frequently low (e.g., 0.85). This is 
because the T-rate alert is an indicator of long-run quality, not 
current quality. 

Table III — Ranges of probability substandard for T-rate exceptions 

Exception       Range of probability    Outlier probability 
Below Normal    0.97 to 1.00            0.75 
ALERT           0.83 to 0.99            0.59 
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6.2 Producer's risk and exceptions 

Any list of rating classes that is put in an exception report has a 
producer's risk. It is the fraction of rating classes on the list whose 
population quality meets the standard. For a given period, let $\theta_i$ = 
population index, rating class $i$, $i = 1, \ldots, J$. Label the rating classes 
so that product 1 through product $L$ are on the exception list. 

Having done qmp for each rating class, we have a posterior distribution 
for each $\theta_i$. Now let 

$$U_i = \begin{cases} 1 & \text{if } \theta_i \le 1, \\ 0 & \text{otherwise.} \end{cases}$$

The number of rating classes on the list whose population quality 
meets the standard is 

$$\sum_{i=1}^{L} U_i,$$

with posterior expected value 

$$\sum_{i=1}^{L} \Pr\{\theta_i \le 1\}.$$

Hence, 

$$[\text{producer's risk}]^{*} = \frac{1}{L}\sum_{i=1}^{L} \Pr\{\theta_i \le 1\}.$$



In qmp, there is an exception list for each threshold probability (tp). 
tp = 0.95 corresponds to the list of all qmp bns and alerts. Figure 13 
shows the qmp producer's risk and number of exceptions as a function 
of tp for the manufacturing audits in a particular period. The smaller 
tp, the bigger the exception list and the bigger the producer's risk. 
Also, note that the producer's risk must be less than 1-tp. 
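
The definition above can be sketched in a few lines. The posterior probabilities below are hypothetical, and the function names are illustrative; the sketch only shows how the list and its producer's risk respond to the threshold probability, and why the risk stays below 1 - tp when every listed class has a probability substandard above tp.

```python
def exception_list(prob_substandard, threshold):
    """Keep classes whose probability of being substandard exceeds the threshold."""
    return [p for p in prob_substandard if p > threshold]

def producers_risk(prob_meets_standard):
    """Average posterior probability that a listed class meets the standard."""
    if not prob_meets_standard:
        raise ValueError("exception list is empty")
    return sum(prob_meets_standard) / len(prob_meets_standard)

# Hypothetical posterior probabilities Pr{theta_i > 1} for eight rating classes.
prob_substandard = [0.999, 0.99, 0.97, 0.96, 0.93, 0.90, 0.70, 0.40]

for tp in (0.95, 0.885):
    listed = exception_list(prob_substandard, tp)
    risk = producers_risk([1.0 - p for p in listed])
    print(f"tp={tp}: {len(listed)} exceptions, producer's risk {risk:.4f}")
```

Lowering the threshold from 0.95 to 0.885 lengthens the hypothetical list from four classes to six and raises the producer's risk, which mirrors the trade-off described in the text.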

The set of all T-rate bns and alerts is another exception list, whose 
producer's risk is 0.037. This is relatively large because some individual 
alerts have relatively large probabilities (e.g., 0.15) of being standard. 
The number of T-rate exceptions (bn + alert) is shown to be 34. 

Of course to implement qmp, a particular tp had to be chosen. The 
tp that would match the T-rate producer's risk is about 0.885. But 
that would lead to an unreasonable (70 percent) increase in exceptions, 
and a producer's risk of 0.037 is considered too high for this type of 
exception reporting, because of the high cost of false alarms. So we 
took tp = 0.95, a reasonable balance between producer's risk and size 
of the exception report. 



* This is not the classical definition of producer's risk. 
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Fig. 13 — Operating characteristics of qmp versus the T-rate. As the QMP threshold 
probability for exceptions (currently set at 0.95) is lowered, the number of exceptions 
and the producer's risk (for a particular rating period) both increase. The number of 
exceptions and producer's risk for the T-rate were 34 and 0.037. For threshold probabil- 
ities between 0.96 and 0.89, qmp has more exceptions and lower producer's risk than the 
T-rate. 



It should be recognized that these curves depend on the particular 
set of audits being analyzed. For example, the curves depend on the 
audit sample sizes. It would be possible to lower sample sizes, decrease 
the threshold probability, and still maintain a comparably sized excep- 
tion report with a reasonable producer's risk. 

Note that consumer's risk is not analyzed in this paper. Consumer's 
risk is more relevant to acceptance sampling than to an audit. The 
main purpose of the audit is to provide quality results to management 
including a compact exception report of high integrity. The Western 
Electric quality control organizations have primary responsibility for 
the quality of each individual lot of product. 

VII. EXAMPLES OF QMP 

Here we explore specific examples that illustrate the similarities and 
differences between qmp and the T-rate. In the examples, both qmp 
and T-rate results are based on the same defect data. For the actual 
implementation of qmp, the defect assessment rules will be different 
than they are for the T-rate as explained in Section 3.1. The intent of 
this section is to compare how the two rating methods work on the 
same data. 
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The examples are shown in Figs. 1, 2, and 14 through 17. These 
figures show a comparison between the time series of T-rates and qmp 
box charts. Table IV contains summaries of the qmp calculations for 
the particular periods that will be discussed in the following text. 

The qmp calculations shown do not use 1976 data. The box chart for 
the first period of data available is not shown except in Fig. 15. Period 
7706 is the first period for which five periods of past data are used in 
the qmp box charts. So the comparisons made in this section will 
involve periods 7706 through 7808. 

7.1 Agreement with T-rate 

Figure 14 illustrates a T-rate borderline* in 7806 preceded by a good 
history. Since the equivalent expectancy (2.78) is fairly small and the 
process is fairly stable, the Best Measure (1.81) is heavily weighted 
(0.65) towards the process average (1.32). The posterior variance (0.36) 
is fairly large, so I95% is better than standard. However, in the next 
period, the T-rate plummets to -4.8 and the process average drops to 
1.77. Now the rate is clearly bn. 

7.2 Disagreement with T-rate 

In Fig. 15, 7802, the T-rate is -3.8 (bn) but there is no exception for 
qmp. One reason is that qmp is based on the assumption that equivalent 
defects have a Poisson distribution. A T-rate of -3.8 is very significant 
for a normal distribution, but not as significant for a Poisson distri- 
bution with an equivalent expectancy of 0.29. For a normal distribution, 
the probability, given standard quality, of being below -3.8 is 0.000072. 
Now the observed number of equivalent defects in 7802 is 2.36. The 
approximate Poisson probability of exceeding 2.36 equivalent defects 
given an equivalent expectancy of 0.29 is 0.15 — very different from 
0.000072. 
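
The Gaussian side of this comparison can be checked with a short sketch. It assumes the T-rate in equivalent-defect terms is T = (e - x)/sqrt(e), which reproduces the quoted -3.8 but is inferred from the numbers rather than stated here. The Poisson figure of 0.15 is described in the text as approximate and is not reproduced, because the approximation used for non-integer equivalent defects is not spelled out in this section.

```python
import math

def t_rate(expectancy, observed):
    """Standardized deviation, ASSUMING T = (e - x) / sqrt(e)."""
    return (expectancy - observed) / math.sqrt(expectancy)

def normal_lower_tail(z):
    """Pr{Z <= z} for a standard normal variable Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

t = t_rate(expectancy=0.29, observed=2.36)
print(f"T-rate: {t:.1f}")
print(f"Normal probability of being below -3.8: {normal_lower_tail(-3.8):.6f}")
```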

Another reason is that the qmp result for 7802 is based on one period 
of data. Rather than using the sample defect index (8.00) as the process 
average, we use a Bayes estimate [eq. (7)] of 2.77. 

Figure 16 is a similar example. In 7708 the T-rate of -2.8 is bn 
because in 7705 the T-rate was -2.7. But again, the -2.8 T-rate 
overstates the significance. The equivalent expectancy is only 0.23. 
Also, the weight (0.60) on the process average (1.81) adjusts the sample 
index (6.81) to the more moderate Best Measure (3.83). This, together 
with the large posterior variance (8.47), implies a comfortable I95% of 
0.56. 

Figure 1 illustrates how two similar T-rates, both on alert, can be 
either a qmp bn or normal. Compare 7708 with 7804. The sample 



* -3 < T-rate < -2, but a good history. 
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indices of 1.57 and 1.50 are very similar, but the process averages of 
2.00 and 1.32 are very different and the weights of 0.51 and 0.67 are 
different. Hence, the Best Measures are very different and the conclu- 
sions are very different. 

Figure 2 illustrates a "weak" alert under the T-rate. The T-rates in 
7705 through 7707 are -0.1, -0.2, and -0.1, respectively. Although it 
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Fig. 14 — Example of agreement. Throughout 1978, qmp and the T-rate are in agree- 
ment. The drop in the sixth period was called "borderline" under the T-rate, because it 
was the first excursion below -2 and it was moderate. The qmp box chart conveys the 
same borderline message. In the seventh period, the product was Below Normal for 
both systems. 
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Fig. 15 — Poisson versus Gaussian assumption. In the second period of 1978, the 
expected number of defects in the sample was 0.29 and the observed number of 
equivalent defects was 2.36. Under the Gaussian assumption, the observed significance 
level is 0.000072 (i.e., T-rate = -3.8); but, under Poisson, the level is 0.15. This explains 
why the qmp box chart contains the standard. 



is unlikely that the quality standard was being met in every period 
from 7702 through 7707, it is not unlikely (probability of 0.23) that the 
quality standard was being met in 7707. 

7.3 Modification treatment 

The T-rate system had modification treatments that resulted from 
the statistical deficiencies of the T-rate (see Section 2.8). There are no 
modification treatments in qmp. The Poisson model and the stabilizing 
effect of shifting the sample index towards the process average alleviate 
the need for modification treatments. 
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In Fig. 17, the 7807 unmodified T-rate is -2.5. It is modified to +0.6 
because of the "isolated" A weight (100 demerits) defect. Under qmp, 
the process average (1.10) is only slightly substandard, the weight 
(0.51) is medium, and the equivalent expectancy (1.40) is small. All 
this implies a safe I95% (0.74) without modification. 



7.4 Venn diagram of BNs and ALERTs 

In the Venn diagram of Fig. 18, bns are shown by circles and alerts 
are shown by rectangles, qmp results are shown by dashed lines and T- 
rate results are shown by solid lines. Every rating class that is bn or 
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Fig. 16— Statistical jitter in the T-rate. With small samples and zero defects, the T- 
rate is slightly larger than zero. Every time a defect is found, the T-rate jitters. The 
message in the qmp chart is that there is too much uncertainty to reach any conclusions. 



Fig. 17 — A case of T-rate modification treatment. Because the T-rate is biased for 
small samples, modification treatments were needed to compensate (seventh period, 
1978). qmp mathematics obviates the need for modification. 



alert under qmp or the T-rate is represented in the Venn diagram. 

Ten rating classes were bn under both methods of rating. Five rating 
classes were bn under the T-rate but alert under qmp. There were 16 
rating classes that were alert under the T-rate but normal under 
qmp. This indicates a major difference. An alert under the T-rate is 
strong evidence that the quality standards for the current period or 
some of the past periods have not been met. But it does not necessarily 
imply strong evidence that the quality standard for the current period 
has not been met. An alert under qmp implies more than a 95 percent 
chance that the current quality standards have not been met. 
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Fig. 18 — Venn diagram of exceptions. The Venn diagram accounts for all qmp and T- 
rate exceptions for a particular period using defects assessed under the T-rate. The lists 
of alerts under the two systems are quite different (only 10 out of 32 in common). 
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APPENDIX A 
The Gamma Distribution 
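A random variable $Y$ has a standard gamma distribution if 

$$\Pr\{Y \le y\} = G_\alpha(y) = \int_0^y \frac{1}{\Gamma(\alpha)}\, r^{\alpha-1} e^{-r}\, dr, \qquad \alpha = \text{shape parameter}. \tag{37}$$

A random variable $X = \tau Y$ has a gamma distribution with shape 
parameter $\alpha$ and scale parameter $\tau$. We write 

$$X \sim \text{Gamma}(\alpha, \tau)$$

and 

$$\Pr\{X \le x\} = G_\alpha\!\left(\frac{x}{\tau}\right).$$

The probability density of $X$ is 

$$\frac{1}{\tau\,\Gamma(\alpha)} \left(\frac{x}{\tau}\right)^{\alpha-1} e^{-x/\tau}.$$

The mean and variance of $X$ are 

$$E(X) = \tau\alpha, \qquad V(X) = \tau^2\alpha;$$

hence 

$$\alpha = E^2(X)/V(X), \qquad \tau = V(X)/E(X).$$

A chi-squared random variable with $\nu$ degrees of freedom has a 
Gamma distribution; namely, 

$$\chi^2_\nu \sim \text{Gamma}\!\left(\frac{\nu}{2},\, 2\right).$$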
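
The moment relations above translate directly into a pair of helper functions. This is a minimal sketch with illustrative names; it uses the chi-squared case as a check, relying on the standard facts that a chi-squared variable with nu degrees of freedom has mean nu and variance 2 nu.

```python
def gamma_from_moments(mean, variance):
    """Return (shape, scale) of the gamma distribution with the given moments."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    return mean ** 2 / variance, variance / mean

def gamma_moments(shape, scale):
    """Return (mean, variance) of Gamma(shape, scale)."""
    return scale * shape, scale ** 2 * shape

# Chi-squared with nu degrees of freedom: mean nu, variance 2 * nu.
nu = 7
shape, scale = gamma_from_moments(nu, 2 * nu)
print(shape, scale)  # 3.5 2.0, i.e. Gamma(nu / 2, 2)
```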



APPENDIX B 

The Poisson-Gamma Bayesian Model 

Theorem B.1: Assume 

$$x_t \mid \theta_t \sim \text{Poisson}(e_t\theta_t), \qquad e_t \text{ known},\ \theta_t \text{ unknown}$$

and 

$$\theta_t \sim \text{Gamma}\!\left(\frac{\theta^2}{\gamma^2},\, \frac{\gamma^2}{\theta}\right)$$

(i.e., mean = $\theta$, variance = $\gamma^2$). Then 

$$\theta_t \mid x_t \sim \text{Gamma}\!\left(\frac{\hat\theta_t^2}{V_t},\, \frac{V_t}{\hat\theta_t}\right),$$

where 

$$\begin{aligned}
\hat\theta_t &= E(\theta_t \mid x_t) = \omega_t\theta + (1 - \omega_t) I_t, \\
I_t &= x_t/e_t, \\
\omega_t &= \frac{\theta/e_t}{\theta/e_t + \gamma^2}, \\
V_t &= V(\theta_t \mid x_t) = (1 - \omega_t)\hat\theta_t/e_t.
\end{aligned}$$

Proof: The sampling distribution of $x_t$ is: 

$$f(x_t \mid \theta_t) = \frac{(e_t\theta_t)^{x_t} \exp\{-e_t\theta_t\}}{x_t!}. \tag{38}$$

The process (prior) distribution of $\theta_t$ is*: 

$$p_0(\theta_t) = \frac{e_0^{x_0}}{\Gamma(x_0)}\, \theta_t^{x_0-1} e^{-e_0\theta_t}, \tag{39}$$

$$\theta = x_0/e_0, \qquad \gamma^2 = x_0/e_0^2.$$

By Bayes theorem, the posterior density of $\theta_t$ is proportional to the 
product of equations (39) and (38), which is in turn proportional to 

$$\left[\theta_t^{x_0-1} e^{-e_0\theta_t}\right]\left[\theta_t^{x_t} e^{-e_t\theta_t}\right] = \theta_t^{x_0+x_t-1} \exp[-(e_0 + e_t)\theta_t]. \tag{40}$$

We recognize eq. (40) as proportional to a Gamma density. So the 
posterior distribution is Gamma with shape parameter $x_0 + x_t$ and 
scale parameter $1/(e_0 + e_t)$. And the posterior mean and variance are 

$$\hat\theta_t = \frac{x_0 + x_t}{e_0 + e_t}, \tag{41}$$

$$V_t = \frac{\hat\theta_t}{e_0 + e_t}. \tag{42}$$

* Here, $x_0$ and $e_0$ are not the same as the "prior data" introduced in Section 4.3.2. 

Now multiply the numerator and denominator in both eqs. (41) and 
(42) by $\theta/e_0 e_t$. Theorem B.1 follows. Q.E.D. 
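
Theorem B.1 can be sketched numerically as follows. The inputs are hypothetical, and the function names are illustrative. The second function recomputes the posterior through the conjugate form of eqs. (41) and (42), with $x_0 = \theta^2/\gamma^2$ and $e_0 = \theta/\gamma^2$, as a consistency check on the weighted-average form.

```python
def poisson_gamma_posterior(x_t, e_t, theta, gamma2):
    """Posterior mean, weight, and variance in the form of Theorem B.1."""
    index = x_t / e_t
    weight = (theta / e_t) / (theta / e_t + gamma2)
    post_mean = weight * theta + (1.0 - weight) * index
    post_var = (1.0 - weight) * post_mean / e_t
    return post_mean, weight, post_var

def conjugate_update(x_t, e_t, theta, gamma2):
    """Same posterior via eqs. (41) and (42), using prior parameters x0 and e0."""
    x0 = theta ** 2 / gamma2
    e0 = theta / gamma2
    post_mean = (x0 + x_t) / (e0 + e_t)
    return post_mean, post_mean / (e0 + e_t)

# Hypothetical inputs: 12 equivalent defects against an expectancy of 5,
# with a process mean of 1.0 and a process variance of 0.25.
mean, weight, var = poisson_gamma_posterior(x_t=12.0, e_t=5.0, theta=1.0, gamma2=0.25)
check_mean, check_var = conjugate_update(x_t=12.0, e_t=5.0, theta=1.0, gamma2=0.25)
print(f"weight={weight:.4f} mean={mean:.4f} variance={var:.4f}")
```

Both routes give the same posterior mean and variance, which is the content of the final step of the proof.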



APPENDIX C 

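Chi-Square, Gamma Bayesian Model 

Theorem C.1: Assume there is a statistic, ss, for which 

$$\frac{\omega}{\sigma^2}(ss) \,\Big|\, \omega \sim \chi^2_\nu \quad (\text{chi-square, } \nu \text{ degrees of freedom}), \qquad \sigma^2 \text{ known},\ \omega \text{ unknown}$$

and 

$$\omega \sim \text{Gamma}\!\left(a_0,\, \frac{\sigma^2}{b_0}\right), \qquad a_0,\ b_0 \text{ known}.$$

Then 

$$\omega \mid ss \sim \text{Gamma}\!\left(a,\, \frac{\sigma^2}{b}\right),$$

where 

$$a = a_0 + \frac{\nu}{2}, \qquad b = b_0 + \frac{ss}{2}.$$

Proof: The sampling density of ss given $\omega$ is 

$$f(ss \mid \omega) = \frac{1}{(2\sigma^2/\omega)^{\nu/2}\,\Gamma(\nu/2)}\, (ss)^{\nu/2-1} \exp\!\left[-\frac{ss}{2\sigma^2/\omega}\right]. \tag{43}$$

The prior density of $\omega$ is 

$$p_0(\omega) = \frac{1}{(\sigma^2/b_0)^{a_0}\,\Gamma(a_0)}\, \omega^{a_0-1} \exp\!\left[-\frac{\omega}{\sigma^2/b_0}\right]. \tag{44}$$

By Bayes theorem, the posterior density of $\omega$ is proportional to the 
product of eqs. (43) and (44): 

$$\left\{\omega^{\nu/2} \exp\!\left[-\frac{\omega}{2\sigma^2/ss}\right]\right\}\left\{\omega^{a_0-1} \exp\!\left[-\frac{\omega}{\sigma^2/b_0}\right]\right\} = \omega^{a-1} \exp\!\left[-\frac{\omega}{\sigma^2/b}\right].$$

Q.E.D. 

Definition: Let $X \sim \text{Gamma}(\alpha, \tau)$. Denote the conditional distribution 
of $X$ given $X \le c$ by 

$$C\text{-Gamma}(\alpha, \tau, c).$$

Corollary: If instead, 

$$\omega \sim C\text{-Gamma}\!\left(a_0,\, \frac{\sigma^2}{b_0},\, 1\right),$$

then 

$$\omega \mid ss \sim C\text{-Gamma}\!\left(a,\, \frac{\sigma^2}{b},\, 1\right).$$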



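Theorem C.2: If $\omega \sim C\text{-Gamma}(a, \sigma^2/b, 1)$, then 

$$E(\omega) = \frac{1}{RF}, \qquad V(\omega) = G,$$

where 

$$R = \frac{b/a}{\sigma^2},$$

$$F = \frac{G_a(aR)}{G_{a+1}(aR)} \quad [\text{see (37)}],$$

$$G = \frac{1}{RF}\left[\left(\frac{a+1}{a}\right)\frac{G_{a+2}(aR)}{R\,G_{a+1}(aR)} - \frac{1}{RF}\right]. \tag{45}$$

Proof: Note 

$$\frac{b}{\sigma^2}\,\omega = aR\omega \sim C\text{-Gamma}(a,\, 1,\, aR).$$

So 

$$\begin{aligned}
E(\omega) &= \frac{1}{aR}\,E(aR\omega) \\
&= \frac{1}{aR\,G_a(aR)} \int_0^{aR} y\, \frac{y^{a-1} e^{-y}}{\Gamma(a)}\, dy \\
&= \frac{\Gamma(a+1)}{aR\,G_a(aR)\,\Gamma(a)} \int_0^{aR} \frac{y^{(a+1)-1} e^{-y}}{\Gamma(a+1)}\, dy \\
&= \frac{a\,G_{a+1}(aR)}{aR\,G_a(aR)} \\
&= 1/RF.
\end{aligned} \tag{46}$$

Now 

$$\begin{aligned}
E(\omega^2) &= \frac{1}{(aR)^2}\,E[(aR\omega)^2] \\
&= \frac{1}{(aR)^2\,G_a(aR)} \int_0^{aR} y^2\, \frac{y^{a-1} e^{-y}}{\Gamma(a)}\, dy \\
&= \frac{\Gamma(a+2)}{(aR)^2\,G_a(aR)\,\Gamma(a)} \int_0^{aR} \frac{y^{(a+2)-1} e^{-y}}{\Gamma(a+2)}\, dy \\
&= \frac{(a+1)\,a\,G_{a+2}(aR)}{(aR)^2\,G_a(aR)} \\
&= \frac{(a+1)\,G_{a+1}(aR)\,G_{a+2}(aR)}{aR^2\,G_a(aR)\,G_{a+1}(aR)} \\
&= \left(\frac{a+1}{a}\right)\frac{G_{a+2}(aR)}{R^2 F\,G_{a+1}(aR)}.
\end{aligned}$$

This along with eq. (46) implies $V(\omega) = G$. 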

Computational formula for F 

Let 

$$g_a(x) = \frac{1}{\Gamma(a)}\, x^{a-1} e^{-x}. \tag{47}$$

From Ref. 11, page 262, 6.5.21, 

$$G_{a+1}(x) = G_a(x) - \left(\frac{x}{a}\right) g_a(x). \tag{48}$$

Now define 

$$B_a(x) = \sum_{i=0}^{\infty} T(i), \qquad T(0) = 1, \qquad T(i) = T(i-1)\,\frac{x}{a+i},$$

so that 

$$B_a(x) = 1 + \frac{x}{a+1} + \frac{x^2}{(a+1)(a+2)} + \cdots.$$

By Ref. 12, page 3, 

$$G_a(x) = \left(\frac{x}{a}\right) g_a(x)\, B_a(x). \tag{49}$$

Putting eqs. (48) and (49) together implies 

$$\frac{G_a(x)}{G_{a+1}(x)} = \frac{(x/a)\,g_a(x)\,B_a(x)}{(x/a)\,g_a(x)\,B_a(x) - (x/a)\,g_a(x)} = \frac{B_a(x)}{B_a(x) - 1}.$$

So 

$$F = \frac{B_a(aR)}{B_a(aR) - 1}. \tag{50}$$



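Computational formula for G 

Directly by definition, it follows that 

$$B_{a+1}(x) = \left(\frac{a+1}{x}\right)[B_a(x) - 1]. \tag{51}$$

Therefore, 

$$\frac{G_{a+2}(x)}{G_{a+1}(x)} = 1 - \frac{1}{B_{a+1}(x)} = 1 - \frac{1}{[(a+1)/x][B_a(x) - 1]} = 1 - \left(\frac{x}{a+1}\right)\frac{1}{B_a(x) - 1}. \tag{52}$$

Now plug eq. (52) into the first term in the square bracket of eq. 
(45) and get 

$$\left(\frac{a+1}{a}\right)\frac{G_{a+2}(aR)}{R\,G_{a+1}(aR)} = \left(\frac{a+1}{a}\right)\frac{1}{R}\left[1 - \frac{aR}{a+1}(F - 1)\right] = \frac{a+1}{aR} - (F - 1). \tag{53}$$

So 

$$G = \frac{1}{RF}\left[\frac{a+1}{aR} - (F - 1) - \frac{1}{RF}\right]. \tag{54}$$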
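
The computational formulas for F and G lend themselves to a short numerical sketch. The function names, the stopping tolerance, and the example values a = 2, R = 1 are assumptions made for illustration. For integer shape the standard gamma distribution function has a closed form, which gives an independent check on the series.

```python
import math

def series_b(a, x, tol=1e-15, max_terms=10_000):
    """B_a(x) = 1 + x/(a+1) + x^2/((a+1)(a+2)) + ..., summed term by term."""
    total, term = 1.0, 1.0
    for i in range(1, max_terms):
        term *= x / (a + i)
        total += term
        if term < tol * total:
            break
    return total

def truncated_gamma_moments(a, R):
    """Mean and variance of omega using F = B/(B - 1), E = 1/(RF), and G."""
    b_val = series_b(a, a * R)
    F = b_val / (b_val - 1.0)
    mean = 1.0 / (R * F)
    variance = mean * ((a + 1.0) / (a * R) - (F - 1.0) - mean)
    return mean, variance

mean, variance = truncated_gamma_moments(a=2.0, R=1.0)

# Closed-form check for a = 2, R = 1:
# G_2(2) = 1 - 3 e^-2 and G_3(2) = 1 - 5 e^-2, so E(omega) = G_3(2) / G_2(2).
expected_mean = (1 - 5 * math.exp(-2)) / (1 - 3 * math.exp(-2))
print(f"mean={mean:.6f} expected={expected_mean:.6f} variance={variance:.6f}")
```

The series converges for any positive argument because the ratio of successive terms, x/(a + i), eventually falls below one.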



APPENDIX D 

Moments of functions of the sample index 

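If 

$$x_t \mid \theta_t \sim \text{Poisson}(e_t\theta_t), \qquad \theta_t \sim \text{Gamma}\!\left(\frac{\theta^2}{\gamma^2},\, \frac{\gamma^2}{\theta}\right),$$

then $x_t \mid \theta, \gamma^2$ is a negative binomial with density 

$$\frac{\Gamma(x_t + \theta^2/\gamma^2)}{x_t!\,\Gamma(\theta^2/\gamma^2)} \left[\frac{1}{1 + \theta/e_t\gamma^2}\right]^{x_t} \left[\frac{\theta/e_t\gamma^2}{1 + \theta/e_t\gamma^2}\right]^{\theta^2/\gamma^2}.$$

Let 

$$\mu_1 = \text{mean of } x_t, \qquad \mu_\nu = \nu\text{th central moment of } x_t, \quad \nu = 2, 3, 4.$$

Then according to (Ref. 11, page 929), 

$$\begin{aligned}
\mu_1 &= \alpha P, \\
\mu_2 &= \alpha P Q, \\
\mu_3 &= \alpha P Q [Q + P], \\
\mu_4 &= \alpha P Q + A(\alpha P Q)^2,
\end{aligned} \tag{55}$$

where 

$$\alpha^{*} = \theta^2/\gamma^2, \qquad P = \gamma^2 e_t/\theta, \qquad Q = 1 + P, \qquad A = 3 + 6\gamma^2/\theta^2.$$

Now let 

$$\xi_1 = \text{mean of } I_t, \qquad \xi_\nu = \nu\text{th central moment of } I_t, \quad \nu = 2, 3, 4.$$

It follows from (55) that 

$$\begin{aligned}
\xi_1 &= \theta, \\
\xi_2 &= \gamma^2 + \frac{\theta}{e_t}, \\
\xi_3 &= \frac{2\gamma^4}{\theta} + \frac{3\gamma^2}{e_t} + \frac{\theta}{e_t^2}, \\
\xi_4 &= A\gamma^4 + \frac{2A\theta\gamma^2}{e_t} + \frac{A\theta^2 + \gamma^2}{e_t^2} + \frac{\theta}{e_t^3}.
\end{aligned} \tag{56}$$

* A different $\alpha$ from the one in the main text. 

An application of these formulas is 

$$V[(I_t - \theta)^2] = \xi_4 - \xi_2^2 = (A - 1)\gamma^4 + \frac{2(A - 1)\theta\gamma^2}{e_t} + \frac{(A - 1)\theta^2 + \gamma^2}{e_t^2} + \frac{\theta}{e_t^3}. \tag{57}$$

Now define 

$$Y_t = (I_t - \theta)^2 - I_t/e_t.$$

Further applications of (56) are 

$$E(Y_t) = \xi_2 - \xi_1/e_t = \gamma^2 \tag{58}$$

and 

$$\begin{aligned}
V(Y_t) &= E(Y_t^2) - \gamma^4 \\
&= E\left\{\left[(I_t - \theta)^2 - \frac{1}{e_t}(I_t - \theta) - \frac{\theta}{e_t}\right]^2\right\} - \gamma^4 \\
&= (A - 1)\gamma^4 + \frac{2(A - 1)\theta\gamma^2 - 4\gamma^4/\theta}{e_t} + \frac{(A - 1)\theta^2 - 4\gamma^2}{e_t^2}.
\end{aligned} \tag{59}$$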
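
Equations (58) and (59) can be checked by summing directly over the negative binomial density. The sketch below uses hypothetical parameter values (theta = 1, gamma^2 = 0.25, e_t = 5) and illustrative function names; the truncation point of the sum is an assumption that is safe here because the tail probabilities decay geometrically.

```python
import math

def negative_binomial_pmf(x, theta, gamma2, e_t):
    """Marginal density of x_t, with alpha = theta^2/gamma2 and P = gamma2*e_t/theta."""
    alpha = theta ** 2 / gamma2
    P = gamma2 * e_t / theta
    log_p = (math.lgamma(x + alpha) - math.lgamma(x + 1) - math.lgamma(alpha)
             + x * math.log(P / (1 + P)) - alpha * math.log(1 + P))
    return math.exp(log_p)

def moments_of_y(theta, gamma2, e_t, x_max=2000):
    """Mean and variance of Y_t = (I_t - theta)^2 - I_t/e_t by direct summation."""
    mean = second = 0.0
    for x in range(x_max + 1):
        index = x / e_t
        y = (index - theta) ** 2 - index / e_t
        p = negative_binomial_pmf(x, theta, gamma2, e_t)
        mean += p * y
        second += p * y * y
    return mean, second - mean ** 2

def variance_of_y_formula(theta, gamma2, e_t):
    """Closed form of V(Y_t) from eq. (59)."""
    A = 3 + 6 * gamma2 / theta ** 2
    return ((A - 1) * gamma2 ** 2
            + (2 * (A - 1) * theta * gamma2 - 4 * gamma2 ** 2 / theta) / e_t
            + ((A - 1) * theta ** 2 - 4 * gamma2) / e_t ** 2)

# Hypothetical parameter values.
mean_y, var_y = moments_of_y(theta=1.0, gamma2=0.25, e_t=5.0)
print(f"E(Y)={mean_y:.6f}  V(Y)={var_y:.6f}  "
      f"formula={variance_of_y_formula(1.0, 0.25, 5.0):.6f}")
```

The summed mean recovers gamma^2, which is why $Y_t$ serves as an unbiased quantity for the process variance, and the summed variance agrees with the closed form.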
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